mirror of
https://github.com/danderson/netboot.git
synced 2025-10-16 18:11:21 +02:00
pixiecore: copy READMEs over.
The documentation doesn't match the current code at all, but it's the target to aim for.
This commit is contained in:
parent
9619a5ab87
commit
c3052f317f
201
pixiecore/README.api.md
Normal file
201
pixiecore/README.api.md
Normal file
@ -0,0 +1,201 @@
|
||||
# API server
|
||||
|
||||
Pixiecore supports two modes of operation: static and API-driven.
|
||||
|
||||
In static mode, you just pass it a -kernel and a -initrd, and
|
||||
Pixiecore will boot any PXE client that it sees.
|
||||
|
||||
In API mode, requests made by PXE clients will be translated into
|
||||
calls to an HTTP API, essentially asking "Should I boot this client?
|
||||
If so, what should I boot it with?" This lets you implement fancy
|
||||
dynamic booting things, as well as construct per-machine commandlines
|
||||
and whatnot.
|
||||
|
||||
## API specification
|
||||
|
||||
Note that Pixiecore is the _client_ of this API. It's your job to
|
||||
implement it and point Pixiecore at the right URL prefix.
|
||||
|
||||
The API consists of a single endpoint:
|
||||
`<apiserver-prefix>/v1/boot/<mac-addr>`. Pixiecore calls this endpoint
|
||||
to learn whether/how to boot a machine with a given MAC address.
|
||||
|
||||
Any non-200 response from the server will cause Pixieboot to ignore
|
||||
the requesting machine.
|
||||
|
||||
A 200 response will cause Pixiecore to boot the requesting machine. A
|
||||
200 response must come with a JSON document conforming to the
|
||||
following specification, with **_italicized_** entries being optional:
|
||||
|
||||
- **kernel** (string): the URL of the kernel to boot.
|
||||
- **_initrd_** (list of strings): URLs of initrds to load. The kernel
|
||||
will flatten all the initrds into a single filesystem.
|
||||
- **_cmdline_** (object): commandline parameters for the kernel. Each
|
||||
key/value pair maps to key=value, where value can be:
|
||||
- **string**: the value is passed verbatim to the kernel
|
||||
- **true**: the value is omitted, only the key is passed to the
|
||||
kernel.
|
||||
- **object**: the value is a URL that Pixiecore will rewrite such
|
||||
that it proxies the request (see below for why you'd want that).
|
||||
- **url** (string): any URL. Pixiecore will rewrite the URL such
|
||||
that it proxies the request.
|
||||
- **_message_** (string): A message to display before booting the
|
||||
provided configuration. Note that displaying this message is on
|
||||
a _best-effort basis only_, as particular implementations of the
|
||||
boot process may not support displaying text.
|
||||
|
||||
Malformed 200 responses will have the same result as a non-200
|
||||
response - Pixiecore will ignore the requesting machine.
|
||||
|
||||
### Kernel, initrd and cmdline URLs
|
||||
|
||||
As described above, the kernel and initrds are specified as URLs,
|
||||
enabling you to host them as you please - you could even link directly
|
||||
to a distro's download links if you wanted.
|
||||
|
||||
URLs provided by the API server can be absolute, or just a naked
|
||||
path. In the latter case, the path is resolved with reference to the
|
||||
API server URL that Pixiecore is using - although note that the path
|
||||
is _not_ rooted within Pixiecore's API path. For example, if you
|
||||
provide `/foo` as a URL to Pixiecore running with `-api
|
||||
http://bar.com/baz`, Pixiecore will fetch `http://bar.com/foo`, _not_
|
||||
`http://bar.com/baz/foo`.
|
||||
|
||||
In addition to `http` and `https` URLs, Pixiecore supports `file://`
|
||||
URLs to serve files from the filesystem of the machine running
|
||||
Pixiecore. You can use this to host large OS images near the target
|
||||
machines, while still deciding what to boot from a central but remote
|
||||
location. Pixiecore uses the "path" segment of the URL, so all
|
||||
`file://` URLs are absolute filesystem paths.
|
||||
|
||||
Pixiecore will not point booting machines directly at the given
|
||||
URLs. Instead, it will point the booting machines to a proxy URL on
|
||||
Pixiecore's HTTP server, and proxy the transfer.
|
||||
|
||||
This is done for two reasons: one, the booting machine may be in a
|
||||
restricted network environment. For example, you may have a policy
|
||||
that machines must do 802.1x authentication to get full network
|
||||
access, else they get dropped on a "remediation" vlan. Proxying the
|
||||
downloads through Pixiecore means you need only one set of edge ACLs
|
||||
on the remediation vlan, regardless of _what_ you're booting: just
|
||||
whitelist Pixiecore's IP:port, and from there your API server can boot
|
||||
whatever you want.
|
||||
|
||||
Second, the booting machine is limited to using HTTP to fetch
|
||||
images. This is probably okay (though not ideal, admittedly - but then
|
||||
again, PXE forces us to TFTP anyway, so we're already screwed for
|
||||
security) on the machine's local ethernet broadcast domain, but is
|
||||
definitely not okay for retrieval over the internet. Proxying through
|
||||
Pixiecore means that your API server can provide HTTPS URLs, and
|
||||
everything but the very last mile between Pixiecore and the machine
|
||||
will be secure.
|
||||
|
||||
The exact URLs visible to the booting machine are an implementation
|
||||
detail of Pixiecore and are subject to breaking change at any
|
||||
time.
|
||||
|
||||
For the curious, the current implementation translates API server
|
||||
provided URLs into `<pixiecore HTTP endpoint>/f/<signed URL
|
||||
blob>`. The signed URL blob is a base64-encoding of running NaCL's
|
||||
secretbox authenticated encryption function over the server-provided
|
||||
URL, using an ephemeral key generated when Pixiecore starts. This
|
||||
steers the booting machine through Pixiecore for the fetch, and lets
|
||||
Pixiecore verify that it's only proxying for URLs that the API server
|
||||
gave it, so it's not an open proxy on your remediation vlan.
|
||||
|
||||
### Multiple calls
|
||||
|
||||
Pixiecore in API mode is stateless. Due to the unique way that PXE
|
||||
works, the API server may receive multiple requests for a single
|
||||
machine boot. Unfortunately, there is no good way to reliably provide
|
||||
a 1:1 mapping between a machine boot and an API server request.
|
||||
|
||||
If you want to implement "single-shot" boot behavior (i.e. "netboot
|
||||
this MAC once, then go back to ignoring it"), you'll need to add a
|
||||
signalling backchannel to the OS image, so that it signals your API
|
||||
server when it's booted. Responding only to the first request for a
|
||||
MAC address will not have the desired effect.
|
||||
|
||||
### Example responses
|
||||
|
||||
Boot into CoreOS stable. **WARNING**: this example is **unsafe**,
|
||||
because the images are linked to over HTTP, and we're not doing GPG
|
||||
verification of the image signatures. This is an example only.
|
||||
|
||||
```json
|
||||
{
|
||||
"kernel": "http://stable.release.core-os.net/amd64-usr/current/coreos_production_pxe.vmlinuz",
|
||||
"initrd": ["http://stable.release.core-os.net/amd64-usr/current/coreos_production_pxe_image.cpio.gz"]
|
||||
}
|
||||
```
|
||||
|
||||
Boot from API server provided files. Pixiecore will grab kernel and
|
||||
initrd from `<apiserver-host>/kernel` and `<apiserver-host>/initrd.[01]`.
|
||||
|
||||
```json
|
||||
{
|
||||
"kernel": "/kernel",
|
||||
"initrd": ["/initrd.0", "/initrd.1"]
|
||||
}
|
||||
```
|
||||
|
||||
Boot from HTTPS, with extra commandline flags.
|
||||
|
||||
```json
|
||||
{
|
||||
"kernel": "https://files.local/kernel",
|
||||
"initrd": ["https://files.local/initrd"],
|
||||
"cmdline": {
|
||||
"selinux": "1",
|
||||
"coreos.autologin": true
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Boot from Pixiecore's local filesystem.
|
||||
|
||||
```json
|
||||
{
|
||||
"kernel": "file:///mnt/data/kernel",
|
||||
"initrd": ["file:///mnt/data/initrd"],
|
||||
}
|
||||
```
|
||||
|
||||
Provide a proxied cloud-config and an unproxied other URL.
|
||||
|
||||
```json
|
||||
{
|
||||
"kernel": "https://files.local/kernel",
|
||||
"initrd": ["https://files.local/initrd"],
|
||||
"cmdline": {
|
||||
"cloud-config-url": {
|
||||
"url": "https://files.local/cloud-config"
|
||||
},
|
||||
"non-proxied-url": "https://files.local/something-else"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Example API server
|
||||
|
||||
There is a very small example API server implementation in the
|
||||
`example` subdirectory. This sample server is not production-quality
|
||||
code (e.g. it uses panic for error handling), but should be a
|
||||
reasonable starting point nonetheless. It implements a reduced form of
|
||||
Pixiecore's static mode: you give it a kernel, initrd and commandline
|
||||
as flags, and it serves those for all boot requests it
|
||||
receives. Unlike Pixiecore's builtin static mode, the sample server
|
||||
can only boot one initrd image.
|
||||
|
||||
## Deprecated features
|
||||
|
||||
### Kernel commandline as a string
|
||||
|
||||
The `cmdline` parameter returned by the API server can also be a plain
|
||||
string instead of an object. That string is the full verbatim
|
||||
commandline to be passed to the booting kernel.
|
||||
|
||||
This form was replaced by the object form to allow Pixiecore to do
|
||||
additional processing of the commandline before passing it to the
|
||||
booting kernel - specifically to allow for URL translation and
|
||||
proxying.
|
282
pixiecore/README.md
Normal file
282
pixiecore/README.md
Normal file
@ -0,0 +1,282 @@
|
||||
NOTE THAT THIS IS NOT YET READY AT ALL, THIS README WAS JUST COPIED
|
||||
OVER TO THIS NEW VERSION, THE CODE SUPPORTS NONE OF THIS RIGHT
|
||||
NOW. Come back later if you want working software.
|
||||
|
||||
[](https://github.com/google/netboot/pixiecore)
|
||||
[](https://github.com/google/netboot/pixiecore)
|
||||
[](https://github.com/google/netboot/pixiecore)
|
||||
|
||||
# Pixiecore, PXE booting for people in a hurry
|
||||
|
||||
```
|
||||
There once was a protocol called PXE,
|
||||
Whose specification was overly tricksy.
|
||||
A committee refined it
|
||||
Into a big Turing tarpit,
|
||||
And now you're using it to boot your PC.
|
||||
```
|
||||
|
||||
Booting a Linux system over the network is quite tedious. You have to
|
||||
set up a TFTP server, configure your DHCP server to recognize PXE
|
||||
clients, and send them the right set of magical options to get them to
|
||||
boot, often fighting rubbish PXE ROM implementations.
|
||||
|
||||
Pixiecore aims to simplify this process, by packing the whole process
|
||||
into a single binary that can cooperate with your network's existing
|
||||
DHCP server.
|
||||
|
||||
Pixiecore can be used either as a simple "just boot into this OS
|
||||
image" tool, or as a building block of a machine management system
|
||||
with its API mode.
|
||||
|
||||
[](https://travis-ci.org/danderson/pixiecore)
|
||||
|
||||
## Pixiecore in static mode ("I just want to boot 5 machines")
|
||||
|
||||
Run the pixiecore binary, passing it a kernel and initrd, and
|
||||
optionally some extra kernel commandline arguments.
|
||||
|
||||
Here's a couple of examples. If you feel like a screencast instead,
|
||||
there's a
|
||||
[very short demo](https://www.youtube.com/watch?v=xjdTOt5YDQM).
|
||||
|
||||
### Tiny Core Linux
|
||||
|
||||
Tiny Core Linux is a positively tiny distro, clocking in at 10M in the
|
||||
configuration we'll be using (it can go lower than that). Let's set
|
||||
ourselves up such that any PXE booting machine on the network boots
|
||||
into a TinyCore ramdisk:
|
||||
|
||||
```shell
|
||||
# Fetch the kernel and the 2 cpio files that form the filesystem.
|
||||
wget http://tinycorelinux.net/7.x/x86/release/distribution_files/{vmlinuz64,modules64.gz,rootfs.gz}
|
||||
|
||||
# In the real world, you would AUTHENTICATE YOUR DOWNLOADS here. TCL sadly
|
||||
# only distributes images over HTTP, so it's anyone's guess what you
|
||||
# just downloaded.
|
||||
|
||||
# Go!
|
||||
pixiecore -kernel vmlinuz64 -initrd rootfs.gz,core.gz,modules64.gz
|
||||
```
|
||||
|
||||
That's it. Any machine that tries to netboot on this network will now
|
||||
boot into a TinyCore ramdisk.
|
||||
|
||||
Notice that we passed multiple cpio archives to `-initrd`. All
|
||||
provided archives will be merged on boot to form the final
|
||||
ramdisk. This is quite handy for things like providing OEM
|
||||
configuration without having to respin the upstream initrd image.
|
||||
|
||||
### CoreOS
|
||||
|
||||
Pixiecore was originally written as a component in an automated
|
||||
installation system for CoreOS on bare metal. For this example, let's
|
||||
set up a netboot for the alpha CoreOS release:
|
||||
|
||||
```shell
|
||||
# Grab the PXE images and verify them
|
||||
wget http://alpha.release.core-os.net/amd64-usr/current/coreos_production_pxe.vmlinuz
|
||||
wget http://alpha.release.core-os.net/amd64-usr/current/coreos_production_pxe_image.cpio.gz
|
||||
|
||||
# In the real world, you would AUTHENTICATE YOUR DOWNLOADS
|
||||
# here. CoreOS distributes image signatures, but that only really
|
||||
# helps if you already know the right GPG key.
|
||||
|
||||
# Go!
|
||||
pixiecore -kernel coreos_production_pxe.vmlinuz -initrd coreos_production_pxe_image.cpio.gz --cmdline coreos.autologin
|
||||
```
|
||||
|
||||
Notice that we're passing an extra commandline argument to make CoreOS
|
||||
automatically log in once it's booted.
|
||||
|
||||
## Pixiecore in API mode
|
||||
|
||||
Think of Pixiecore in API mode as a "PXE to HTTP" translator. Whenever
|
||||
Pixiecore sees a machine trying to PXE boot, it will ask a remote HTTP
|
||||
API (which you implement) what to do. The API server can tell
|
||||
Pixiecore to ignore the machine, or tell it to boot into a given
|
||||
kernel/initrd/commandline.
|
||||
|
||||
Effectively, Pixiecore in API mode lets you pretend that your machines
|
||||
speak a simple JSON protocol when trying to netboot. This makes it
|
||||
_far_ easier to play with netbooting in your own software.
|
||||
|
||||
To start Pixiecore in API mode, pass it the HTTP API endpoint through
|
||||
the `-api` flag. The endpoint you provide must implement the Pixiecore
|
||||
boot API, as described in the [API spec](README.api.md).
|
||||
|
||||
You can find a sample API server implementation in the `example`
|
||||
subdirectory. The code is not production-grade, but gives a short
|
||||
illustration of how the protocol works by reimplementing a subset of
|
||||
Pixiecore's static mode as an API server.
|
||||
|
||||
## Running in Docker
|
||||
|
||||
Pixiecore is available as a Docker image called
|
||||
`danderson/pixiecore`. It's an automatic Docker Hub build that tracks
|
||||
the repository.
|
||||
|
||||
Because Pixiecore needs to listen for DHCP traffic, it has to run with
|
||||
the host network stack.
|
||||
|
||||
```shell
|
||||
sudo docker run -v .:/image --net=host danderson/pixiecore -kernel /image/coreos_production_pxe.vmlinuz -initrd /image/coreos_production_pxe_image.cpio.gz
|
||||
```
|
||||
|
||||
## How it works
|
||||
|
||||
Pixiecore implements four different, but related protocols in one
|
||||
binary, which together can take a PXE ROM from nothing to booting
|
||||
Linux. They are: ProxyDHCP, PXE, TFTP, and HTTP. Let's walk through
|
||||
the boot process for a PXE ROM.
|
||||
|
||||
### DHCP/ProxyDHCP
|
||||
|
||||
The first thing a PXE ROM does is request a configuration through
|
||||
DHCP, waiting for a DHCP reply that includes PXE vendor options. The
|
||||
normal way of providing these options is to edit your DHCP server's
|
||||
configuration to provide them to clients that identify themselves as
|
||||
PXE clients. Unfortunately, reconfiguring your network's DHCP server
|
||||
is tedious at best, and impossible if you DHCP server is built into a
|
||||
consumer router, or managed by someone else.
|
||||
|
||||
Pixiecore instead uses a feature of the PXE specification called
|
||||
_ProxyDHCP_. As you might guess from the name, ProxyDHCP is not a
|
||||
proxy at all (yeah, the PXE spec is like that), but a second DHCP
|
||||
server that only provides PXE configuration.
|
||||
|
||||
When the PXE ROM sends out a `DHCPDISCOVER`, it gets two replies back:
|
||||
one containing network configuration from the primary DHCP server, and
|
||||
one containing only PXE DHCP options from the ProxyDHCP server. The
|
||||
PXE firmware combines the two, and continues as if the primary server
|
||||
had provided all the configuration.
|
||||
|
||||
### PXE
|
||||
|
||||
In theory, you'd expect the ProxyDHCP server to just provide a TFTP
|
||||
server IP and a filename to the PXE firmware, and it would proceed to
|
||||
download and boot that just like the BOOTP of old.
|
||||
|
||||
Sadly, the average quality of PXE ROM implementations is abysmal, and
|
||||
many of them fail to chainload correctly if you try to do this from a
|
||||
ProxyDHCP server.
|
||||
|
||||
So, instead, we make use of the spec's "PXE menu" functionality, which
|
||||
lets you tell the PXE firmware to display a boot menu. Just like
|
||||
everything else in PXE, this is quite brittle, so nobody actually uses
|
||||
it to display menus - instead, they just push a more fully featured
|
||||
bootloader over PXE, and let that bootloader do the fancy work.
|
||||
|
||||
However, PXE menus seem to work reliably when combined with
|
||||
ProxyDHCP... And the PXE configuration can provide a timeout after
|
||||
which the first menu entry is booted... And that timeout can be set to
|
||||
zero.
|
||||
|
||||
So, we can just provide a single-entry menu, with a zero timeout, and
|
||||
chainload that way! But wait, there's more terribleness. PXE menu
|
||||
entries don't just list a TFTP server and file to load, because that
|
||||
would be too simple. Instead, each menu entry maps to a "Boot Server
|
||||
Type", and yet another DHCP option maps that boot server type to a set
|
||||
of IP addresses.
|
||||
|
||||
Those IP addresses aren't TFTP servers, but PXE boot servers. PXE boot
|
||||
servers listen on port 4011. They use the DHCP packet format, but only
|
||||
as a way of conveying a DHCP option that says "please tell me how to
|
||||
boot the following Boot Server Type". It's quite possibly the least
|
||||
efficient protocol encoding ever devised.
|
||||
|
||||
At long last, when the PXE server receives that request, it can reply
|
||||
with a BOOTP-ish packet that specified next-server and a filename. And
|
||||
_those_ are, at long last, TFTP.
|
||||
|
||||
### TFTP
|
||||
|
||||
After navigating the eldritch horror of PXE, TFTP is a breath of fresh
|
||||
air. It is indeed a trivial protocol for transferring files. I have
|
||||
found some PXE ROMs that manage to add unnecessary complexity even to
|
||||
that, but by and large, this step is straightforward.
|
||||
|
||||
However, TFTP is quite slow, because it doesn't support transfer
|
||||
windows (well, it does, but it's an extension defined in an RFC
|
||||
published in 2015, so guess how many PXE ROMs implement it...). As a
|
||||
result, you must pay one round-trip per ~1500 bytes transferred, and
|
||||
even on a gigabit network, that slows things down.
|
||||
|
||||
Given that some netboot images are quite large (CoreOS clocks in at
|
||||
almost 200MB), what we really want is to switch to a more efficient
|
||||
protocol. That's where PXELINUX comes in.
|
||||
|
||||
PXELINUX is a small bootloader that knows how to boot Linux kernels,
|
||||
and it comes in a variant that can speak HTTP. PXELINUX is 90kB, which
|
||||
even over TFTP is very fast to transfer.
|
||||
|
||||
Thus, Pixiecore uses TFTP only to transfer PXELINUX, and from there
|
||||
steers it to HTTP for the rest of the loading process.
|
||||
|
||||
### HTTP
|
||||
|
||||
We've finally crawled our way up to the late nineties - we can speak
|
||||
HTTP! Pixiecore's HTTP server is wonderfully familiar and normal. It
|
||||
just serves up a support file that PXELINUX needs (`ldlinux.c32`), a
|
||||
trivial PXELINUX configuration telling it to boot a Linux kernel, and
|
||||
the user-provided kernel and initrd files.
|
||||
|
||||
PXELINUX grabs all of that, and finally, Linux boots.
|
||||
|
||||
### Recap
|
||||
|
||||
This is what the whole boot process looks like on the wire.
|
||||
|
||||
#### Dramatis Personae
|
||||
|
||||
- **PXE ROM**, a brittle firmware burned into the network card.
|
||||
- **DHCP server**, a plain old DHCP server providing network configuration.
|
||||
- **Pixieboot**, the Hero and server of ProxyDHCP, PXE, TFTP and HTTP.
|
||||
- **PXELINUX**, an open source bootloader of the [Syslinux project](http://www.syslinux.org).
|
||||
|
||||
#### Timeline
|
||||
|
||||
- PXE ROM starts, broadcasts `DHCPDISCOVER`.
|
||||
- DHCP server responds with a `DHCPOFFER` containing network configs.
|
||||
- Pixiecore's ProxyDHCP server responds with a `DHCPOFFER` containing a PXE boot menu.
|
||||
- PXE ROM does a `DHCPREQUEST`/`DHCPACK` exchange with the DHCP server to get a network configuration.
|
||||
- PXE ROM processes the PXE boot menu, decides to boot menu entry 0.
|
||||
- PXE ROM sends a `DHCPREQUEST` to Pixiecore's PXE server, asking for a boot file.
|
||||
- Pixiecore's PXE server responds with a `DHCPACK` listing a TFTP
|
||||
server, a boot filename, and a PXELINUX vendor option to make it use
|
||||
HTTP.
|
||||
- PXE ROM downloads PXELINUX from Pixiecore's TFTP server, and hands off to PXELINUX.
|
||||
- PXELINUX fetches its configuration from Pixiecore's HTTP server.
|
||||
- PXELINUX fetches a kernel and ramdisk from Pixiecore's HTTP server, and boots Linux.
|
||||
|
||||
## Known deviations from specifications
|
||||
|
||||
Pixiecore aims to be compliant with the relevant specifications for
|
||||
TFTP, DHCP, and PXE. This section lists the places where Pixiecore
|
||||
deliberately deviates from the spec to support buggy clients.
|
||||
|
||||
### Missing Client Machine Identifier (GUID) option
|
||||
|
||||
Some PXE ROMs don't send DHCP option 97, "Client Machine Identifier
|
||||
(GUID)", in their DHCP and PXE requests. According to the PXE 2.1
|
||||
specification and RFC 4578, this makes the requests non-compliant:
|
||||
|
||||
> This option MUST be present in all DHCP and PXE packets sent by PXE-compliant clients and servers.
|
||||
|
||||
Pixiecore's behavior implements "SHOULD" instead of "MUST": if a
|
||||
client request has a GUID, Pixiecore's response will respond with a
|
||||
GUID. If the client request has no GUID, Pixiecore omits option 97 in
|
||||
its response.
|
||||
|
||||
## Development
|
||||
|
||||
You can use [Vagrant](https://www.vagrantup.com/) to quickly setup a test environment:
|
||||
|
||||
(HOST)$ vagrant up --provider=libvirt pxeserver
|
||||
(HOST)$ vagrant ssh pxeserver
|
||||
(PXESERVER)$ wget http://alpha.release.core-os.net/amd64-usr/current/coreos_production_pxe.vmlinuz
|
||||
(PXESERVER)$ wget http://alpha.release.core-os.net/amd64-usr/current/coreos_production_pxe_image.cpio.gz
|
||||
(PXESERVER)$ pixiecore -debug -kernel coreos_production_pxe.vmlinuz -initrd coreos_production_pxe_image.cpio.gz --cmdline coreos.autologin
|
||||
### In another terminal
|
||||
(HOST)$ vagrant up --provider=libvirt pxeclient1
|
||||
|
Loading…
x
Reference in New Issue
Block a user