pixiecore: copy READMEs over.

The documentation doesn't match the current code at all, but it's
the target to aim for.
David Anderson 2016-08-09 00:32:45 -07:00
parent 9619a5ab87
commit c3052f317f
2 changed files with 483 additions and 0 deletions

pixiecore/README.api.md Normal file

@ -0,0 +1,201 @@
# API server
Pixiecore supports two modes of operation: static and API-driven.
In static mode, you just pass it a -kernel and a -initrd, and
Pixiecore will boot any PXE client that it sees.
In API mode, requests made by PXE clients will be translated into
calls to an HTTP API, essentially asking "Should I boot this client?
If so, what should I boot it with?" This lets you implement fancy
dynamic booting things, as well as construct per-machine commandlines
and whatnot.
## API specification
Note that Pixiecore is the _client_ of this API. It's your job to
implement it and point Pixiecore at the right URL prefix.
The API consists of a single endpoint:
`<apiserver-prefix>/v1/boot/<mac-addr>`. Pixiecore calls this endpoint
to learn whether/how to boot a machine with a given MAC address.
Any non-200 response from the server will cause Pixiecore to ignore
the requesting machine.
A 200 response will cause Pixiecore to boot the requesting machine. A
200 response must come with a JSON document conforming to the
following specification, with **_italicized_** entries being optional:
- **kernel** (string): the URL of the kernel to boot.
- **_initrd_** (list of strings): URLs of initrds to load. The kernel
  will flatten all the initrds into a single filesystem.
- **_cmdline_** (object): commandline parameters for the kernel. Each
  key/value pair maps to key=value, where value can be:
  - **string**: the value is passed verbatim to the kernel.
  - **true**: the value is omitted, only the key is passed to the
    kernel.
  - **object**: the value wraps a URL that Pixiecore will rewrite such
    that it proxies the request (see below for why you'd want that).
    The object has one field:
    - **url** (string): any URL. Pixiecore will rewrite the URL such
      that it proxies the request.
- **_message_** (string): A message to display before booting the
  provided configuration. Note that displaying this message is on
  a _best-effort basis only_, as particular implementations of the
  boot process may not support displaying text.
Malformed 200 responses will have the same result as a non-200
response - Pixiecore will ignore the requesting machine.
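As a rough illustration of the server side of this contract, here is a minimal Go sketch. The `bootSpec` type, the `knownMachines` inventory, the helper names and the colon-separated lowercase MAC format are all assumptions made for the example, not something the spec above prescribes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// bootSpec mirrors the JSON document described above. Optional
// fields are left out of the output when they are empty.
type bootSpec struct {
	Kernel  string                 `json:"kernel"`
	Initrd  []string               `json:"initrd,omitempty"`
	Cmdline map[string]interface{} `json:"cmdline,omitempty"`
	Message string                 `json:"message,omitempty"`
}

// knownMachines is a hypothetical hard-coded inventory keyed by MAC.
var knownMachines = map[string]bootSpec{
	"52:54:00:12:34:56": {
		Kernel:  "/kernel",
		Initrd:  []string{"/initrd.0"},
		Cmdline: map[string]interface{}{"console": "ttyS0", "quiet": true},
	},
}

// respond returns the HTTP status and body for a MAC address.
// Unknown machines get a 404, which Pixiecore treats as "ignore".
func respond(mac string) (int, []byte) {
	spec, ok := knownMachines[strings.ToLower(mac)]
	if !ok {
		return http.StatusNotFound, nil
	}
	body, err := json.Marshal(spec)
	if err != nil {
		return http.StatusInternalServerError, nil
	}
	return http.StatusOK, body
}

// handleBoot serves <prefix>/v1/boot/<mac-addr> (empty prefix here).
func handleBoot(w http.ResponseWriter, r *http.Request) {
	status, body := respond(strings.TrimPrefix(r.URL.Path, "/v1/boot/"))
	if status == http.StatusOK {
		w.Header().Set("Content-Type", "application/json")
	}
	w.WriteHeader(status)
	w.Write(body)
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/boot/", handleBoot)
	// A real server would now call http.ListenAndServe(addr, mux).
	status, body := respond("52:54:00:12:34:56")
	fmt.Println(status, string(body))
}
```

Keeping the lookup in `respond` separate from the HTTP plumbing makes it easy to check the "200 with a document" versus "anything else" split without opening a socket.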
### Kernel, initrd and cmdline URLs
As described above, the kernel and initrds are specified as URLs,
enabling you to host them as you please - you could even link directly
to a distro's download links if you wanted.
URLs provided by the API server can be absolute, or just a naked
path. In the latter case, the path is resolved with reference to the
API server URL that Pixiecore is using - although note that the path
is _not_ rooted within Pixiecore's API path. For example, if you
provide `/foo` as a URL to Pixiecore running with `-api
http://bar.com/baz`, Pixiecore will fetch `http://bar.com/foo`, _not_
`http://bar.com/baz/foo`.
In addition to `http` and `https` URLs, Pixiecore supports `file://`
URLs to serve files from the filesystem of the machine running
Pixiecore. You can use this to host large OS images near the target
machines, while still deciding what to boot from a central but remote
location. Pixiecore uses the "path" segment of the URL, so all
`file://` URLs are absolute filesystem paths.
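The resolution rule and the `file://` handling described above behave like standard relative-reference resolution. The sketch below uses Go's `net/url` to show that behaviour; the function names `resolve` and `localPath` are made up for the example and are not Pixiecore's own.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolve interprets ref relative to the API server URL: absolute
// URLs are kept as they are, bare paths are rooted at the API host.
func resolve(apiBase, ref string) (string, error) {
	base, err := url.Parse(apiBase)
	if err != nil {
		return "", err
	}
	u, err := url.Parse(ref)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(u).String(), nil
}

// localPath returns the filesystem path of a file:// URL, or "" for
// any other scheme.
func localPath(raw string) string {
	u, err := url.Parse(raw)
	if err != nil || u.Scheme != "file" {
		return ""
	}
	return u.Path
}

func main() {
	abs, err := resolve("http://bar.com/baz", "/foo")
	if err != nil {
		panic(err)
	}
	fmt.Println(abs)
	fmt.Println(localPath("file:///mnt/data/kernel"))
}
```

With the inputs from the text, `resolve("http://bar.com/baz", "/foo")` yields `http://bar.com/foo`, and the `file://` URL maps to the absolute path `/mnt/data/kernel`.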
Pixiecore will not point booting machines directly at the given
URLs. Instead, it will point the booting machines to a proxy URL on
Pixiecore's HTTP server, and proxy the transfer.
This is done for two reasons: one, the booting machine may be in a
restricted network environment. For example, you may have a policy
that machines must do 802.1x authentication to get full network
access, else they get dropped on a "remediation" vlan. Proxying the
downloads through Pixiecore means you need only one set of edge ACLs
on the remediation vlan, regardless of _what_ you're booting: just
whitelist Pixiecore's IP:port, and from there your API server can boot
whatever you want.
Second, the booting machine is limited to using HTTP to fetch
images. This is probably okay (though not ideal, admittedly - but then
again, PXE forces us to TFTP anyway, so we're already screwed for
security) on the machine's local ethernet broadcast domain, but is
definitely not okay for retrieval over the internet. Proxying through
Pixiecore means that your API server can provide HTTPS URLs, and
everything but the very last mile between Pixiecore and the machine
will be secure.
The exact URLs visible to the booting machine are an implementation
detail of Pixiecore and are subject to breaking change at any
time.
For the curious, the current implementation translates API server
provided URLs into `<pixiecore HTTP endpoint>/f/<signed URL
blob>`. The signed URL blob is the base64 encoding of the server-provided
URL sealed with NaCl's secretbox authenticated encryption function,
using an ephemeral key generated when Pixiecore starts. This
steers the booting machine through Pixiecore for the fetch, and lets
Pixiecore verify that it's only proxying for URLs that the API server
gave it, so it's not an open proxy on your remediation vlan.
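To make the idea concrete, here is a sketch of sealing and opening such a blob. It deliberately swaps in AES-GCM from the Go standard library as a stand-in for NaCl secretbox, so that the example has no external dependencies; the function names and the nonce-prefix layout are assumptions, not Pixiecore's actual format.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
)

// newKey generates an ephemeral 256-bit key, as would happen once at
// process start.
func newKey() ([]byte, error) {
	key := make([]byte, 32)
	_, err := rand.Read(key)
	return key, err
}

// newAEAD builds the AES-GCM stand-in for secretbox.
func newAEAD(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealURL encrypts and authenticates rawURL into a URL-safe blob.
func sealURL(key []byte, rawURL string) (string, error) {
	aead, err := newAEAD(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	// The nonce is prepended so that openURL can recover it.
	sealed := aead.Seal(nonce, nonce, []byte(rawURL), nil)
	return base64.URLEncoding.EncodeToString(sealed), nil
}

// openURL reverses sealURL and fails for blobs not made with key.
func openURL(key []byte, blob string) (string, error) {
	aead, err := newAEAD(key)
	if err != nil {
		return "", err
	}
	sealed, err := base64.URLEncoding.DecodeString(blob)
	if err != nil {
		return "", err
	}
	if len(sealed) < aead.NonceSize() {
		return "", errors.New("blob too short")
	}
	nonce, ct := sealed[:aead.NonceSize()], sealed[aead.NonceSize():]
	plain, err := aead.Open(nil, nonce, ct, nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}

func main() {
	key, err := newKey()
	if err != nil {
		panic(err)
	}
	blob, err := sealURL(key, "https://files.local/kernel")
	if err != nil {
		panic(err)
	}
	fmt.Println("/f/" + blob)
	fmt.Println(openURL(key, blob))
}
```

Because the key only lives in memory, blobs stop being valid when the process restarts, and a blob that was tampered with or minted elsewhere fails authentication instead of being proxied.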
### Multiple calls
Pixiecore in API mode is stateless. Due to the unique way that PXE
works, the API server may receive multiple requests for a single
machine boot. Unfortunately, there is no good way to reliably provide
a 1:1 mapping between a machine boot and an API server request.
If you want to implement "single-shot" boot behavior (i.e. "netboot
this MAC once, then go back to ignoring it"), you'll need to add a
signalling backchannel to the OS image, so that it signals your API
server when it's booted. Responding only to the first request for a
MAC address will not have the desired effect.
### Example responses
Boot into CoreOS stable. **WARNING**: this example is **unsafe**,
because the images are linked to over HTTP, and we're not doing GPG
verification of the image signatures. This is an example only.
```json
{
"kernel": "http://stable.release.core-os.net/amd64-usr/current/coreos_production_pxe.vmlinuz",
"initrd": ["http://stable.release.core-os.net/amd64-usr/current/coreos_production_pxe_image.cpio.gz"]
}
```
Boot from API server provided files. Pixiecore will grab kernel and
initrd from `<apiserver-host>/kernel` and `<apiserver-host>/initrd.[01]`.
```json
{
"kernel": "/kernel",
"initrd": ["/initrd.0", "/initrd.1"]
}
```
Boot from HTTPS, with extra commandline flags.
```json
{
"kernel": "https://files.local/kernel",
"initrd": ["https://files.local/initrd"],
"cmdline": {
"selinux": "1",
"coreos.autologin": true
}
}
```
Boot from Pixiecore's local filesystem.
```json
{
"kernel": "file:///mnt/data/kernel",
"initrd": ["file:///mnt/data/initrd"]
}
```
Provide a proxied cloud-config and an unproxied other URL.
```json
{
"kernel": "https://files.local/kernel",
"initrd": ["https://files.local/initrd"],
"cmdline": {
"cloud-config-url": {
"url": "https://files.local/cloud-config"
},
"non-proxied-url": "https://files.local/something-else"
}
}
```
### Example API server
There is a very small example API server implementation in the
`example` subdirectory. This sample server is not production-quality
code (e.g. it uses panic for error handling), but should be a
reasonable starting point nonetheless. It implements a reduced form of
Pixiecore's static mode: you give it a kernel, initrd and commandline
as flags, and it serves those for all boot requests it
receives. Unlike Pixiecore's builtin static mode, the sample server
can only boot one initrd image.
## Deprecated features
### Kernel commandline as a string
The `cmdline` parameter returned by the API server can also be a plain
string instead of an object. That string is the full verbatim
commandline to be passed to the booting kernel.
This form was replaced by the object form to allow Pixiecore to do
additional processing of the commandline before passing it to the
booting kernel - specifically to allow for URL translation and
proxying.
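A consumer of the API has to accept both forms of `cmdline`. The sketch below shows one way to do that in Go; `renderCmdline` and its `proxy` callback are hypothetical names, and sorting the keys is a choice made here for reproducible output, since the spec does not define an ordering.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// renderCmdline turns the raw JSON "cmdline" value into a kernel
// commandline. proxy rewrites a URL into its proxied equivalent.
func renderCmdline(raw []byte, proxy func(string) string) (string, error) {
	// Deprecated form: a plain string, passed through verbatim.
	var s string
	if err := json.Unmarshal(raw, &s); err == nil {
		return s, nil
	}
	var obj map[string]interface{}
	if err := json.Unmarshal(raw, &obj); err != nil {
		return "", err
	}
	keys := make([]string, 0, len(obj))
	for k := range obj {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var parts []string
	for _, k := range keys {
		switch v := obj[k].(type) {
		case string:
			parts = append(parts, k+"="+v)
		case bool:
			if !v {
				return "", fmt.Errorf("cmdline key %q: only true is allowed", k)
			}
			parts = append(parts, k) // key only, no value
		case map[string]interface{}:
			u, ok := v["url"].(string)
			if !ok {
				return "", fmt.Errorf("cmdline key %q: missing url", k)
			}
			parts = append(parts, k+"="+proxy(u))
		default:
			return "", fmt.Errorf("cmdline key %q: unsupported value", k)
		}
	}
	return strings.Join(parts, " "), nil
}

func main() {
	proxy := func(u string) string { return "http://pixiecore.example/f/" + u }
	fmt.Println(renderCmdline([]byte(`"console=ttyS0 quiet"`), proxy))
	fmt.Println(renderCmdline([]byte(`{"selinux":"1","coreos.autologin":true}`), proxy))
}
```

The `proxy` callback in `main` is a placeholder; it only marks where URL translation would happen.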

pixiecore/README.md Normal file

@ -0,0 +1,282 @@
NOTE THAT THIS IS NOT YET READY AT ALL, THIS README WAS JUST COPIED
OVER TO THIS NEW VERSION, THE CODE SUPPORTS NONE OF THIS RIGHT
NOW. Come back later if you want working software.
[![software](https://img.shields.io/badge/software-unstable-red.svg)](https://github.com/google/netboot/pixiecore)
[![software2](https://img.shields.io/badge/software-unready-red.svg)](https://github.com/google/netboot/pixiecore)
[![production](https://img.shields.io/badge/production-avoid-red.svg)](https://github.com/google/netboot/pixiecore)
# Pixiecore, PXE booting for people in a hurry
```
There once was a protocol called PXE,
Whose specification was overly tricksy.
A committee refined it
Into a big Turing tarpit,
And now you're using it to boot your PC.
```
Booting a Linux system over the network is quite tedious. You have to
set up a TFTP server, configure your DHCP server to recognize PXE
clients, and send them the right set of magical options to get them to
boot, often fighting rubbish PXE ROM implementations.
Pixiecore aims to simplify this process, by packing the whole process
into a single binary that can cooperate with your network's existing
DHCP server.
Pixiecore can be used either as a simple "just boot into this OS
image" tool, or as a building block of a machine management system
with its API mode.
[![Build Status](https://travis-ci.org/danderson/pixiecore.svg?branch=master)](https://travis-ci.org/danderson/pixiecore)
## Pixiecore in static mode ("I just want to boot 5 machines")
Run the pixiecore binary, passing it a kernel and initrd, and
optionally some extra kernel commandline arguments.
Here are a couple of examples. If you feel like a screencast instead,
there's a
[very short demo](https://www.youtube.com/watch?v=xjdTOt5YDQM).
### Tiny Core Linux
Tiny Core Linux is a positively tiny distro, clocking in at 10M in the
configuration we'll be using (it can go lower than that). Let's set
ourselves up such that any PXE booting machine on the network boots
into a TinyCore ramdisk:
```shell
# Fetch the kernel and the 2 cpio files that form the filesystem.
wget http://tinycorelinux.net/7.x/x86/release/distribution_files/{vmlinuz64,modules64.gz,rootfs.gz}
# In the real world, you would AUTHENTICATE YOUR DOWNLOADS here. TCL sadly
# only distributes images over HTTP, so it's anyone's guess what you
# just downloaded.
# Go!
pixiecore -kernel vmlinuz64 -initrd rootfs.gz,modules64.gz
```
That's it. Any machine that tries to netboot on this network will now
boot into a TinyCore ramdisk.
Notice that we passed multiple cpio archives to `-initrd`. All
provided archives will be merged on boot to form the final
ramdisk. This is quite handy for things like providing OEM
configuration without having to respin the upstream initrd image.
### CoreOS
Pixiecore was originally written as a component in an automated
installation system for CoreOS on bare metal. For this example, let's
set up a netboot for the alpha CoreOS release:
```shell
# Grab the PXE images and verify them
wget http://alpha.release.core-os.net/amd64-usr/current/coreos_production_pxe.vmlinuz
wget http://alpha.release.core-os.net/amd64-usr/current/coreos_production_pxe_image.cpio.gz
# In the real world, you would AUTHENTICATE YOUR DOWNLOADS
# here. CoreOS distributes image signatures, but that only really
# helps if you already know the right GPG key.
# Go!
pixiecore -kernel coreos_production_pxe.vmlinuz -initrd coreos_production_pxe_image.cpio.gz --cmdline coreos.autologin
```
Notice that we're passing an extra commandline argument to make CoreOS
automatically log in once it's booted.
## Pixiecore in API mode
Think of Pixiecore in API mode as a "PXE to HTTP" translator. Whenever
Pixiecore sees a machine trying to PXE boot, it will ask a remote HTTP
API (which you implement) what to do. The API server can tell
Pixiecore to ignore the machine, or tell it to boot into a given
kernel/initrd/commandline.
Effectively, Pixiecore in API mode lets you pretend that your machines
speak a simple JSON protocol when trying to netboot. This makes it
_far_ easier to play with netbooting in your own software.
To start Pixiecore in API mode, pass it the HTTP API endpoint through
the `-api` flag. The endpoint you provide must implement the Pixiecore
boot API, as described in the [API spec](README.api.md).
You can find a sample API server implementation in the `example`
subdirectory. The code is not production-grade, but gives a short
illustration of how the protocol works by reimplementing a subset of
Pixiecore's static mode as an API server.
## Running in Docker
Pixiecore is available as a Docker image called
`danderson/pixiecore`. It's an automatic Docker Hub build that tracks
the repository.
Because Pixiecore needs to listen for DHCP traffic, it has to run with
the host network stack.
```shell
sudo docker run -v "$PWD":/image --net=host danderson/pixiecore -kernel /image/coreos_production_pxe.vmlinuz -initrd /image/coreos_production_pxe_image.cpio.gz
```
## How it works
Pixiecore implements four different, but related protocols in one
binary, which together can take a PXE ROM from nothing to booting
Linux. They are: ProxyDHCP, PXE, TFTP, and HTTP. Let's walk through
the boot process for a PXE ROM.
### DHCP/ProxyDHCP
The first thing a PXE ROM does is request a configuration through
DHCP, waiting for a DHCP reply that includes PXE vendor options. The
normal way of providing these options is to edit your DHCP server's
configuration to provide them to clients that identify themselves as
PXE clients. Unfortunately, reconfiguring your network's DHCP server
is tedious at best, and impossible if your DHCP server is built into a
consumer router, or managed by someone else.
Pixiecore instead uses a feature of the PXE specification called
_ProxyDHCP_. As you might guess from the name, ProxyDHCP is not a
proxy at all (yeah, the PXE spec is like that), but a second DHCP
server that only provides PXE configuration.
When the PXE ROM sends out a `DHCPDISCOVER`, it gets two replies back:
one containing network configuration from the primary DHCP server, and
one containing only PXE DHCP options from the ProxyDHCP server. The
PXE firmware combines the two, and continues as if the primary server
had provided all the configuration.
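A ProxyDHCP server first has to tell PXE clients apart from ordinary DHCP clients. PXE firmware announces itself with a vendor class identifier (DHCP option 60) that starts with `PXEClient`. The sketch below parses the options area of a DHCP packet, meaning the bytes after the fixed header and magic cookie, and checks for that marker; the function names are illustrative only.

```go
package main

import (
	"fmt"
	"strings"
)

// parseOptions decodes the type-length-value options area of a DHCP
// packet (the bytes that follow the magic cookie).
func parseOptions(b []byte) (map[byte][]byte, error) {
	opts := make(map[byte][]byte)
	for i := 0; i < len(b); {
		code := b[i]
		i++
		switch code {
		case 0: // pad option, no length byte
			continue
		case 255: // end option
			return opts, nil
		}
		if i >= len(b) {
			return nil, fmt.Errorf("option %d: missing length", code)
		}
		n := int(b[i])
		i++
		if i+n > len(b) {
			return nil, fmt.Errorf("option %d: truncated value", code)
		}
		opts[code] = b[i : i+n]
		i += n
	}
	return opts, nil
}

// isPXEClient reports whether option 60 marks the sender as PXE firmware.
func isPXEClient(opts map[byte][]byte) bool {
	return strings.HasPrefix(string(opts[60]), "PXEClient")
}

func main() {
	// Option 53 (message type) = 1 (DHCPDISCOVER), then option 60.
	raw := append([]byte{53, 1, 1, 60, 9}, []byte("PXEClient")...)
	raw = append(raw, 255)
	opts, err := parseOptions(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(isPXEClient(opts))
}
```

Requests without the marker can simply be dropped, leaving them to the primary DHCP server.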
### PXE
In theory, you'd expect the ProxyDHCP server to just provide a TFTP
server IP and a filename to the PXE firmware, and it would proceed to
download and boot that just like the BOOTP of old.
Sadly, the average quality of PXE ROM implementations is abysmal, and
many of them fail to chainload correctly if you try to do this from a
ProxyDHCP server.
So, instead, we make use of the spec's "PXE menu" functionality, which
lets you tell the PXE firmware to display a boot menu. Just like
everything else in PXE, this is quite brittle, so nobody actually uses
it to display menus - instead, they just push a more fully featured
bootloader over PXE, and let that bootloader do the fancy work.
However, PXE menus seem to work reliably when combined with
ProxyDHCP... And the PXE configuration can provide a timeout after
which the first menu entry is booted... And that timeout can be set to
zero.
So, we can just provide a single-entry menu, with a zero timeout, and
chainload that way! But wait, there's more terribleness. PXE menu
entries don't just list a TFTP server and file to load, because that
would be too simple. Instead, each menu entry maps to a "Boot Server
Type", and yet another DHCP option maps that boot server type to a set
of IP addresses.
Those IP addresses aren't TFTP servers, but PXE boot servers. PXE boot
servers listen on port 4011. They use the DHCP packet format, but only
as a way of conveying a DHCP option that says "please tell me how to
boot the following Boot Server Type". It's quite possibly the least
efficient protocol encoding ever devised.
At long last, when the PXE server receives that request, it can reply
with a BOOTP-ish packet that specifies next-server and a filename. And
_those_ are, at long last, TFTP.
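The single-entry, zero-timeout menu can be expressed as a handful of PXE vendor sub-options (carried inside DHCP option 43). The sketch below builds them using the sub-option numbers as I read them in the PXE 2.1 specification (6 discovery control, 8 boot servers, 9 boot menu, 10 menu prompt); treat the exact byte layout, the `0x8000` boot server type and the function name as assumptions to verify against the spec before relying on them.

```go
package main

import (
	"bytes"
	"fmt"
)

// pxeVendorOptions builds a one-entry PXE boot menu that points at a
// single boot server and is selected immediately.
func pxeVendorOptions(server [4]byte, label string) ([]byte, error) {
	if len(label) == 0 || len(label) > 200 {
		return nil, fmt.Errorf("label must be 1-200 bytes, got %d", len(label))
	}
	var b bytes.Buffer
	// Discovery control: disable broadcast and multicast discovery,
	// so the client only talks to the servers listed below.
	b.Write([]byte{6, 1, 3})
	// Boot servers: type 0x8000, one IPv4 address.
	b.Write([]byte{8, 7, 0x80, 0x00, 1})
	b.Write(server[:])
	// Boot menu: one entry of type 0x8000 with a description.
	b.Write([]byte{9, byte(3 + len(label)), 0x80, 0x00, byte(len(label))})
	b.WriteString(label)
	// Menu prompt: timeout of zero seconds, then the prompt text.
	b.Write([]byte{10, byte(1 + len(label)), 0})
	b.WriteString(label)
	// End of vendor options.
	b.WriteByte(255)
	return b.Bytes(), nil
}

func main() {
	opts, err := pxeVendorOptions([4]byte{192, 0, 2, 1}, "Pixiecore")
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", opts)
}
```

The boot server type in the menu entry has to match the type in the boot servers list; that shared value is what ties the menu entry to an IP address.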
### TFTP
After navigating the eldritch horror of PXE, TFTP is a breath of fresh
air. It is indeed a trivial protocol for transferring files. I have
found some PXE ROMs that manage to add unnecessary complexity even to
that, but by and large, this step is straightforward.
However, TFTP is quite slow, because it doesn't support transfer
windows (well, it does, but it's an extension defined in an RFC
published in 2015, so guess how many PXE ROMs implement it...). As a
result, you must pay one round-trip per ~1500 bytes transferred, and
even on a gigabit network, that slows things down.
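The cost of lockstep transfers is easy to estimate: with one block in flight at a time, the transfer takes at least one round trip per block, regardless of link speed. The sketch below does that arithmetic; the 500 microsecond round-trip time is an assumed figure for illustration, not a measurement.

```go
package main

import "fmt"

// lockstepSeconds estimates the minimum duration of a transfer that
// sends one block per round trip. Link bandwidth is ignored.
func lockstepSeconds(sizeBytes, blockBytes, rttMicros int64) float64 {
	blocks := (sizeBytes + blockBytes - 1) / blockBytes // round up
	return float64(blocks*rttMicros) / 1e6
}

func main() {
	// A 200 MB image in ~1500-byte blocks at an assumed 500 µs RTT.
	fmt.Printf("%.1f s\n", lockstepSeconds(200*1000*1000, 1500, 500))
	// A 90 kB bootloader under the same assumptions.
	fmt.Printf("%.3f s\n", lockstepSeconds(90*1000, 1500, 500))
}
```

Under these assumptions the 200 MB image needs roughly 67 seconds of pure round-trip waiting, while a 90 kB file needs about 30 milliseconds.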
Given that some netboot images are quite large (CoreOS clocks in at
almost 200MB), what we really want is to switch to a more efficient
protocol. That's where PXELINUX comes in.
PXELINUX is a small bootloader that knows how to boot Linux kernels,
and it comes in a variant that can speak HTTP. PXELINUX is 90kB, which
even over TFTP is very fast to transfer.
Thus, Pixiecore uses TFTP only to transfer PXELINUX, and from there
steers it to HTTP for the rest of the loading process.
### HTTP
We've finally crawled our way up to the late nineties - we can speak
HTTP! Pixiecore's HTTP server is wonderfully familiar and normal. It
just serves up a support file that PXELINUX needs (`ldlinux.c32`), a
trivial PXELINUX configuration telling it to boot a Linux kernel, and
the user-provided kernel and initrd files.
PXELINUX grabs all of that, and finally, Linux boots.
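The "trivial PXELINUX configuration" can be pictured as a few lines of Syslinux config syntax (`DEFAULT`, `LABEL`, `KERNEL`, `APPEND`). The sketch below generates one; the function name, the `linux` label and the example URLs are assumptions, and the configuration Pixiecore actually serves may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// pxelinuxConfig renders a single-entry PXELINUX configuration that
// boots the given kernel with optional initrds and commandline.
func pxelinuxConfig(kernelURL string, initrdURLs []string, cmdline string) string {
	var b strings.Builder
	b.WriteString("DEFAULT linux\n")
	b.WriteString("LABEL linux\n")
	fmt.Fprintf(&b, "KERNEL %s\n", kernelURL)

	var args []string
	if len(initrdURLs) > 0 {
		// Multiple initrds are passed as a comma-separated list.
		args = append(args, "initrd="+strings.Join(initrdURLs, ","))
	}
	if cmdline != "" {
		args = append(args, cmdline)
	}
	if len(args) > 0 {
		fmt.Fprintf(&b, "APPEND %s\n", strings.Join(args, " "))
	}
	return b.String()
}

func main() {
	fmt.Print(pxelinuxConfig(
		"http://192.0.2.1/kernel",
		[]string{"http://192.0.2.1/initrd.0", "http://192.0.2.1/initrd.1"},
		"coreos.autologin",
	))
}
```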
### Recap
This is what the whole boot process looks like on the wire.
#### Dramatis Personae
- **PXE ROM**, a brittle firmware burned into the network card.
- **DHCP server**, a plain old DHCP server providing network configuration.
- **Pixiecore**, the Hero and server of ProxyDHCP, PXE, TFTP and HTTP.
- **PXELINUX**, an open source bootloader of the [Syslinux project](http://www.syslinux.org).
#### Timeline
- PXE ROM starts, broadcasts `DHCPDISCOVER`.
- DHCP server responds with a `DHCPOFFER` containing network configs.
- Pixiecore's ProxyDHCP server responds with a `DHCPOFFER` containing a PXE boot menu.
- PXE ROM does a `DHCPREQUEST`/`DHCPACK` exchange with the DHCP server to get a network configuration.
- PXE ROM processes the PXE boot menu, decides to boot menu entry 0.
- PXE ROM sends a `DHCPREQUEST` to Pixiecore's PXE server, asking for a boot file.
- Pixiecore's PXE server responds with a `DHCPACK` listing a TFTP
server, a boot filename, and a PXELINUX vendor option to make it use
HTTP.
- PXE ROM downloads PXELINUX from Pixiecore's TFTP server, and hands off to PXELINUX.
- PXELINUX fetches its configuration from Pixiecore's HTTP server.
- PXELINUX fetches a kernel and ramdisk from Pixiecore's HTTP server, and boots Linux.
## Known deviations from specifications
Pixiecore aims to be compliant with the relevant specifications for
TFTP, DHCP, and PXE. This section lists the places where Pixiecore
deliberately deviates from the spec to support buggy clients.
### Missing Client Machine Identifier (GUID) option
Some PXE ROMs don't send DHCP option 97, "Client Machine Identifier
(GUID)", in their DHCP and PXE requests. According to the PXE 2.1
specification and RFC 4578, this makes the requests non-compliant:
> This option MUST be present in all DHCP and PXE packets sent by PXE-compliant clients and servers.
Pixiecore's behavior implements "SHOULD" instead of "MUST": if a
client request has a GUID, Pixiecore's response will include that
GUID. If the client request has no GUID, Pixiecore omits option 97 in
its response.
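The rule amounts to echoing option 97 only when the client sent it. A small sketch, assuming options are held in a map from option code to raw value (a representation chosen for the example, with a hypothetical function name):

```go
package main

import "fmt"

// withClientGUID copies DHCP option 97 from the request into the
// response when the client supplied it, and leaves it out otherwise.
func withClientGUID(req, resp map[byte][]byte) map[byte][]byte {
	if guid, ok := req[97]; ok {
		// Copy the value so the response does not alias the request.
		resp[97] = append([]byte(nil), guid...)
	}
	return resp
}

func main() {
	withGUID := map[byte][]byte{97: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}}
	fmt.Println(len(withClientGUID(withGUID, map[byte][]byte{})[97]))

	noGUID := map[byte][]byte{}
	_, present := withClientGUID(noGUID, map[byte][]byte{})[97]
	fmt.Println(present)
}
```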
## Development
You can use [Vagrant](https://www.vagrantup.com/) to quickly set up a test environment:

    (HOST)$ vagrant up --provider=libvirt pxeserver
    (HOST)$ vagrant ssh pxeserver
    (PXESERVER)$ wget http://alpha.release.core-os.net/amd64-usr/current/coreos_production_pxe.vmlinuz
    (PXESERVER)$ wget http://alpha.release.core-os.net/amd64-usr/current/coreos_production_pxe_image.cpio.gz
    (PXESERVER)$ pixiecore -debug -kernel coreos_production_pxe.vmlinuz -initrd coreos_production_pxe_image.cpio.gz --cmdline coreos.autologin
    ### In another terminal
    (HOST)$ vagrant up --provider=libvirt pxeclient1