Fix chunk format documentation for varint encoding

While preparing previous commits, we identified an inconsistency in the
chunk format documentation. The `varint` encoding can require up to 10
bytes for a 64-bit integer, such as when timestamps are encoded.
However, the chunk length field is a 32-bit integer, which requires at
most 5 bytes in `varint` encoding.

This is reflected in the code, where a maximum of 5 bytes are read when
parsing the chunk length.

    50ba25f273/tsdb/chunks/chunks.go (L709-L711)

Co-authored-by: Istvan Zoltan Ballok <istvan.zoltan.ballok@sap.com>
Signed-off-by: Victor Herrero Otal <victor.herrero.otal@sap.com>
This commit is contained in:
Victor Herrero Otal 2025-06-06 14:39:36 +02:00
parent ea87698892
commit fa8f051d06

View File

@ -35,7 +35,7 @@ in-file offset (lower 4 bytes) and segment sequence number (upper 4 bytes).
Notes: Notes:
* `len`: Chunk size in bytes. 1 to 10 bytes long using the [`<uvarint>` encoding](https://go.dev/src/encoding/binary/varint.go). * `len`: Chunk size in bytes. 1 to 5 bytes long using the [`<uvarint>` encoding](https://go.dev/src/encoding/binary/varint.go).
* `encoding`: Currently either `XOR`, `histogram`, or `floathistogram`, see [code for numerical values](https://github.com/prometheus/prometheus/blob/02d0de9987ad99dee5de21853715954fadb3239f/tsdb/chunkenc/chunk.go#L28-L47). * `encoding`: Currently either `XOR`, `histogram`, or `floathistogram`, see [code for numerical values](https://github.com/prometheus/prometheus/blob/02d0de9987ad99dee5de21853715954fadb3239f/tsdb/chunkenc/chunk.go#L28-L47).
* `data`: See below for each encoding. * `data`: See below for each encoding.
* `checksum`: Checksum of `encoding` and `data`. It's a [cyclic redundancy check](https://en.wikipedia.org/wiki/Cyclic_redundancy_check) with the Castagnoli polynomial, serialised as an unsigned 32 bits big endian number. Can be referred as a `CRC-32C`. * `checksum`: Checksum of `encoding` and `data`. It's a [cyclic redundancy check](https://en.wikipedia.org/wiki/Cyclic_redundancy_check) with the Castagnoli polynomial, serialised as an unsigned 32 bits big endian number. Can be referred as a `CRC-32C`.