We currently set the working URI to "tftp://${next-server}/" whenever
the value of the next-server setting changes.
Many years ago this was required for the default boot sequence, which
would treat the boot filename as a potentially relative URI. Since
commit 481a217 ("[autoboot] Retain initial-slash (if present) when
constructing TFTP URIs"), the default boot sequence has always
constructed an absolute URI.
There is still a valid use case for setting the default working URI
based on the value of next-server: it allows command sequences such as
dhcp && chain ${filename}
or
set next-server 192.168.0.1
chain myscript.ipxe
to work as expected. Note that since "${filename}" may be a relative
path, it is necessary for the current working URI to be the root of
the TFTP server, i.e. "tftp://${next-server}/", rather than the full
path "tftp://${next-server}/${filename}".
In the case of a UEFI HTTP(S) boot, we already have a working URI set
on entry (to be the URI of the iPXE binary itself). Running "dhcp"
would change this current working URI, which is quite unintuitive.
Similarly, once we start executing an image (e.g. a script), the
current working URI is set to the image's own URI, so that relative
URIs may be used in a script to download files relative to the
location of the script itself. Running "dhcp" within the script may
or may not change the current working URI: it will happen to do so
only if the TFTP server address happens to change. This is also
somewhat unintuitive.
Change the behaviour of the TFTP settings applicator to treat the TFTP
server URI as a fallback, to be used only if nothing else has already
set a current working URI. This is technically a breaking change in
behaviour, but the new behaviour is almost certainly much less
surprising than the existing behaviour. (Scripts that do genuinely
expect to acquire a new TFTP server address can use full URIs of the
form "tftp://${next-server}/...": this is more explicit and will work
on iPXE builds both before and after this change.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
TLS defines a mechanism for gracefully closing a connection via a
closure alert. We currently ignore this alert since it is a warning
rather than an error, and warnings are allowed to be ignored.
In almost all cases, a higher-level protocol such as HTTP will already
give us the information required to know when the connection should be
closed. In the very rare case of an HTTPS server that does not send a
Content-Length header and does not close the TCP connection, only the
closure alert indicates that the whole file has been retrieved.
Handle a received closure alert by gracefully closing the connection.
Reported-by: Tuomo Tanskanen <tuomo.tanskanen@est.tech>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
It is unintuitive to have to include an "ifopen" at the start of an
autoexec.ipxe script. Provide a mechanism for upper-layer drivers to
mark a network device to be opened automatically upon registration,
and do so for the device to which the cached DHCPACK is applied.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
TLS versions 1.2 and earlier define a 4-byte gmt_unix_time field as
part of the 32-byte ClientHello random data block, as a (minimal) form
of protection against a broken random number generator. iPXE has
never set this field to a correct value. Early versions had only
relative timers and so set this field to zero. Commit 5da7123 ("[tls]
Include current time within the client random bytes") did set this
field to the current time, but neglected to use the correct byte
ordering.
TLS version 1.3 (defined in RFC 8446) omits the gmt_unix_time field
completely and just defines the whole 32-byte value as random data.
Simplify the code by using the approach defined in RFC 8446.
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RA contains MTU setting, this is especially needed in some networks
which don't have a a full 1500 MTU link to IPv6 internet. Mostly due
to some providers (such as Microsoft Azure) not having a working pMTUd
setup.
Signed-off-by: Christian I. Nilsson <nikize@gmail.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The original implementation in commit 943b300 ("[syslog] Add basic
support for encrypted syslog via TLS") was based on examples found in
the rsyslog documentation rather than on RFC 5425, and unfortunately
used the default syslog port number 514 rather than the syslog-tls
port number 6514 defined in the RFC.
Extend parsing of the syslog server name to allow for an optional port
number (in the relatively intuitive format "server[:port]"). Retain
the existing (and incorrect) default port number to avoid breaking
backwards compatibility with existing setups.
Reported-by: Christian Nilsson <nikize@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
MS-CHAPv2 and the underlying DES algorithm are cryptographically
obsolete, but still relatively widely used. There is no impact to
UEFI Secure Boot from using these obsolete algorithms: the only
untrusted inputs are the username, password, and received network
packets, and all of these are thoroughly validated before use.
Review these files and mark them as permitted for UEFI Secure Boot.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some past security reviews carried out for UEFI Secure Boot signing
submissions have covered specific drivers or functional areas of iPXE.
Mark all of the files comprising these areas as permitted for UEFI
Secure Boot.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Mark all files used in a standard build of bin-x86_64-efi/snponly.efi
as permitted for UEFI Secure Boot. These files represent the core
functionality of iPXE that is guaranteed to have been included in
every binary that was previously subject to a security review and
signed by Microsoft. It is therefore legitimate to assume that at
least these files have already been reviewed to the required standard
multiple times.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The third-party 802.11 stack and NFS protocol code are known to
include multiple potential vulnerabilities and are explicitly
forbidden from being included in Secure Boot signed builds. This is
currently handled at the per-directory level by defining a list of
source directories (SRCDIRS_INSEC) that are to be excluded from Secure
Boot builds.
Annotate all files in these directories with FILE_SECBOOT() to convey
this information to the new per-file Secure Boot permissibility check,
and remove the old separation between SRCDIRS and SRCDIRS_INSEC.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Treat each delayed transmission as a pending operation, so that the
"sync" command can be used to ensure that all delayed packets have
been transmitted.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Gather some basic statistics on TCP connections to allow out-of-order
packets and duplicate packets to be observed even in non-debug builds.
Report these statistics via the existing "ipstat" command, rather than
introducing a separate "tcpstat" command, on the basis that we do not
need the additional overhead of a separate command.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We will currently enqueue (rather than discard) retransmitted packets
that lie immediately before the current receive window. These packets
will be harmlessly discarded when the receive queue is processed
immediately afterwards, but cause confusion when attempting to debug
TCP performance issues.
Fix by adjusting the comparison so that packets that lie immediately
before the receive window will be discarded immediately and never
enqueued.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a fault-injection mechanism that allows an arbitrary delay
(configured via config/fault.h) to be added to any packets transmitted
via the neighbour resolution mechanism, as a way of reproducing
symptoms that occur only on high-latency connections such as a
satellite uplink.
The neighbour discovery mechanism is not a natural conceptual fit for
this artficial delay, since neighbour discovery has nothing to do with
transmit latency. However, the neighbour discovery mechanism happens
to already include a deferred transmission queue that can be (ab)used
to implement this artifical delay in a minimally intrusive way. In
particular, there is zero code size impact on a standard build with no
artificial delay configured.
Implementing the delay only for packets transmitted via neighbour
resolution has the side effect that broadcast packets (such as DHCP
and ARP) are unaffected. This is likely in practice to produce a
better emulation of a high-latency uplink scenario, where local
network traffic such as DHCP and ARP will complete quickly and only
the subsequent TCP/UDP traffic will experience delays.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Split out the logic for transmitting any deferred packets as a
separate function, as a precursor to supporting the ability to add
deliberate latency to transmitted packets.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use the discovery protocol pointer field (rather than the running
state of the discovery timer) to determine whether or not neighbour
discovery is ongoing, as a precursor to allowing the timer to be
(ab)used for adding deliberate latency to transmitted packets.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The API for neighbour_tx() allows for an explicit source link-layer
address, but this will be ignored if the packet is deferred for
transmission after completion of neighbour discovery. The network
device's own link-layer address will always be used when sending
neighbour discovery packets, and when sending any deferred packets
after discovery completes.
All callers pass in the network device's own link-layer address as the
source address anyway, and so this explicit source link-layer address
is never used for any meaningful purpose.
Simplify the neighbour_tx() API by removing the ability to pass in an
explicit source link-layer address.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Once an HTTP download has started (i.e. once all request headers have
been sent), we generally have no more data to transmit. If an HTTP
connection dies silently (e.g. due to a network failure, a NIC driver
bug, or a server crash) then there is no mechanism that will currently
detect this situation by default.
We do send TCP keep-alives (to maintain state in intermediate routers
and firewalls), but we do not attempt to elicit a response from the
server. RFC 9293 explicitly states that the absence of a response to
a TCP keep-alive probe must not be interpreted as indicating a dead
connection, since TCP cannot guarantee reliable delivery of packets
that do not advance the sequence number.
Scripts may use the "--timeout" option to impose an overall time limit
on downloads, but this mechanism is off by default and requires
additional thought and configuration by the user (which goes against
iPXE's general philosophy of being as automatic as possible).
Add an idle connection watchdog timer which will cause the HTTP
download to abort after 120 seconds of inactivity. Activity is
defined as an I/O buffer being delivered to the HTTP transaction's
upstream data transfer interface.
Downloads over HTTPS may experience a substantial delay until the
first recorded activity, since all TLS negotiation (including
cross-chained certificate downloads and OCSP checks) must complete
before any application data can be sent. We choose to not reset the
watchdog timer during TLS negotiation, on the basis that 120 seconds
is already an unreasonably long time for a TLS negotiation to take to
complete. If necessary, resetting the watchdog timer could be
accomplished by having the TLS layer deliver zero-length I/O buffers
(via xfer_seek()) to indicate forward progress being made.
When using PeerDist content encoding, the downloaded content
information is not passed through to the content-decoded interface and
so will not be classed as activity. Any activity in the individual
PeerDist block downloads (either from peers or as range requests from
the origin server) will be classed as activity in the overall
download, since individual block downloads do not buffer data but
instead pass it through directly via the PeerDist download
multiplexer.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RFC 7627 states that renegotiation becomes no longer secure under
various circumstances when the non-extended master secret is used.
The description of the precise set of circumstances is spread across
various points within the document and is not entirely clear.
Avoid a superset of the circumstances in which renegotiation
apparently becomes insecure by refusing renegotiation completely
unless the extended master secret is used.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RFC 7627 section 5.3 states that the client must abort the handshake
if the server attempts to resume a session where the master secret
calculation method stored in the session does not match the method
used for the connection being resumed.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RFC 7627 defines the Extended Master Secret (EMS) as an alternative
calculation that uses the digest of all handshake messages rather than
just the client and server random bytes.
Add support for negotiating the Extended Master Secret extension and
performing the relevant calculation of the master secret.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The calculation for the extended master secret as defined in RFC 7627
relies upon the digest of all handshake messages up to and including
the Client Key Exchange.
Facilitate this calculation by generating the master secret only after
sending the Client Key Exchange message.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for RFC 3442 classless static routes provided via DHCP
option 121.
Originally-implemented-by: Hazel Smith <hazel.smith@leicester.ac.uk>
Originally-implemented-by: Raphael Pour <raphael.pour@hetzner.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Extend the definition of an IPv4 routing table entry to allow for the
expression of non-default gateways for specified off-link subnets, and
of on-link secondary subnets (where we can send directly to the
destination address even though our source address is not within the
subnet).
This more precise definition also allows us to correctly handle
routing in the (uncommon for iPXE) case when multiple network
interfaces are open concurrently and more than one interface has a
default gateway.
The common case of a single IPv4 address/netmask and a default gateway
now results in two routing table entries. To retain backwards
compatibility with existing documentation (and to avoid on-screen
clutter), the "route" command prints default gateways on the same line
as the locally assigned address. There is therefore no change in
output from the "route" command unless explicit additional (off-link
or on-link) routes are present.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The uaccess.h header is no longer required for any code that touches
external ("user") memory, since such memory accesses are now performed
through pointer dereferences. Reduce the number of files including
this header.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Since all data transfer buffer contents are now accessible via direct
pointer dereferences, remove the unnecessary abstractions for read and
write operations and create two new data transfer buffer types: a
fixed-size buffer, and a void buffer that records its size but can
never receive non-zero lengths of data. These replace the custom data
buffer types currently implemented for EFI PXE TFTP downloads and for
block device translations.
A new operation xferbuf_detach() is required to take ownership of the
data accumulated in the data transfer buffer, since we no longer rely
on the existence of an independently owned external data pointer for
data transfer buffers allocated via umalloc().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Simplify the block device code by assuming that all read/write buffers
are directly accessible via pointer dereferences.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some network devices (observed with the SNP interface to the wireless
network card on an HP Elitebook 840 G10) will stop working if they are
left for too long without being polled.
Add the concept of an insomniac network device, that must continue to
be polled even when closed.
Note that drivers are already permitted to call netdev_rx() et al even
when closed: this will already be happening for USB devices since
polling operates at the level of the whole USB bus, rather than at the
level of individual USB devices.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The allocation of memory for the certificate chain link may cause the
certificate itself to be freed by the cache discarder, if the only
current reference to the certificate is held by the certificate store
and the system runs out of memory during the call to malloc().
Ensure that this cannot happen by taking out a temporary additional
reference to the certificate within x509_append(), rather than
requiring the caller to do so.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Large transmitted records may arise if we have long client certificate
chains or if a client sends a large block of data (such as a large
HTTP POST payload). Fragment records as needed to comply with the
value that we advertise via the max_fragment_length extension.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RFC5246 states that "a client MAY send no certificates if it does not
have an appropriate certificate to send in response to the server's
authentication request". This use case may arise when the server is
using optional client certificate verification and iPXE has not been
provided with a client certificate to use.
Treat the absence of a suitable client certificate as a non-fatal
condition and send a Certificate message containing no certificates as
permitted by RFC5246.
Reported-by: Alexandre Ravey <alexandre@voilab.ch>
Originally-implemented-by: Alexandre Ravey <alexandre@voilab.ch>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Provide a custom xfer_alloc_iob() handler to ensure that transmit I/O
buffers contain sufficient headroom for the TLS record header and
record initialisation vector, and sufficient tailroom for the MAC,
block cipher padding, and authentication tag. This allows us to use
in-place encryption for the actual data within the I/O buffer, which
essentially halves the amount of memory that needs to be allocated for
a TLS data transmission.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Datagram sockets such as UDP, ICMP, and fibre channel tend to provide
a custom xfer_alloc_iob() handler to ensure that transmit I/O buffers
contain sufficient headroom to accommodate any required protocol
headers.
Stream sockets such as TCP and TLS do not typically provide a custom
xfer_alloc_iob() handler at present. The default handler simply calls
alloc_iob(), and so stream socket consumers can therefore get away
with using alloc_iob() rather than xfer_alloc_iob().
Fix the HTTP and ONC RPC protocols to use xfer_alloc_iob() where
relevant, in order to operate correctly if the underlying stream
socket chooses to provide a custom xfer_alloc_iob() handler.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The elliptic curve point representation for the x25519 curve includes
only the X value, since the curve is designed such that the Montgomery
ladder does not need to ever know or calculate a Y value. There is no
curve point format byte: the public key data is simply the X value.
The pre-master secret is also simply the X value of the shared secret
curve point.
The point representation for the NIST curves includes both X and Y
values, and a single curve point format byte that must indicate that
the format is uncompressed. The pre-master secret for the NIST curves
does not include both X and Y values: only the X value is used.
Extend the definition of an elliptic curve to allow the point size to
be specified separately from the key size, and extend the definition
of a TLS named curve to include an optional curve point format byte
and a pre-master secret length.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Split out the portion of tls_send_client_key_exchange_ecdhe() that
actually performs the elliptic curve key exchange into a separate
function ecdhe_key().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Instances of cipher and digest algorithms tend to get called
repeatedly to process substantial amounts of data. This is not true
for public-key algorithms, which tend to get called only once or twice
for a given key.
Simplify the public-key algorithm API so that there is no reusable
algorithm context. In particular, this allows callers to omit the
error handling currently required to handle memory allocation (or key
parsing) errors from pubkey_init(), and to omit the cleanup calls to
pubkey_final().
This change does remove the ability for a caller to distinguish
between a verification failure due to a memory allocation failure and
a verification failure due to a bad signature. This difference is not
material in practice: in both cases, for whatever reason, the caller
was unable to verify the signature and so cannot proceed further, and
the cause of the error will be visible to the user via the return
status code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The TLS connection structure has grown to become unmanageably large as
new features and support for new TLS protocol versions have been added
over time.
Split out the portions of struct tls_connection that are specific to
client and server operations into separate structures, and simplify
some structure field names.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The TLS connection structure has grown to become unmanageably large as
new features and support for new TLS protocol versions have been added
over time.
Split out the portions of struct tls_connection that are specific to
transmit and receive operations into separate structures, and simplify
some structure field names.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Asymmetric keys are invariably encountered within ASN.1 structures
such as X.509 certificates, and the various large integers within an
RSA key are themselves encoded using ASN.1.
Simplify all code handling asymmetric keys by passing keys as a single
ASN.1 cursor, rather than separate data and length pointers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An attempt to use a validator for an empty certificate chain will
correctly fail the overall validation with the "empty certificate
chain" error propagated from x509_auto_append().
In a debug build, the call to validator_name() will attempt to call
x509_name() on a non-existent certificate, resulting in garbage in the
debug message.
Fix by checking for the special case of an empty certificate chain.
This issue does not affect non-debug builds, since validator_name() is
(as per its description) called only for debug messages.
Signed-off-by: Michael Brown <mcb30@ipxe.org>