known_addrs in PeerInfoInternal is append-only — addresses accumulate
via add_addr() and PeerList gossip but are never removed. In dynamic
environments (k8s pod restarts, DHCP, NAT traversal), this list grows
unboundedly with stale addresses.
Combined with sequential iteration in try_connect() and no TCP connect
timeout in netapp.rs, each unreachable address blocks reconnection for
the kernel's TCP SYN timeout (75-130s on Linux). With 10+ stale
addresses, worst-case reconnection exceeds 750s — a full outage for
replication_factor=3 clusters.
This commit contains the following two changes:
1. Address failure tracking and pruning (peering.rs): Track consecutive
connection failures per address in PeerInfoInternal. After 3 failures,
prune from known_addrs. Reset count when address is re-advertised via
gossip or incoming connection. Prevents unbounded list growth.
2. Shuffle before connecting (peering.rs): Randomize address order in
try_connect() so the valid address (often appended last) gets a fair
chance instead of always trying stale addresses first.
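For concreteness, the combined behavior could look like the sketch below. This is illustrative, not the actual diff: `MAX_ADDR_FAILURES`, `record_failure`, and `candidate_addrs` are hypothetical names, and the real `PeerInfoInternal` carries more state.
```rust
use rand::seq::SliceRandom;
use std::net::SocketAddr;

const MAX_ADDR_FAILURES: u8 = 3;

struct PeerInfoInternal {
    // Each known address now carries a consecutive-failure count.
    known_addrs: Vec<(SocketAddr, u8)>,
}

impl PeerInfoInternal {
    // Called when an address is advertised via gossip or an incoming
    // connection: insert it, or reset its failure count if already known.
    fn add_addr(&mut self, addr: SocketAddr) {
        match self.known_addrs.iter_mut().find(|(a, _)| *a == addr) {
            Some((_, failures)) => *failures = 0,
            None => self.known_addrs.push((addr, 0)),
        }
    }

    // Called after a failed connection attempt: bump the counter and
    // prune any address that has failed three times in a row.
    fn record_failure(&mut self, addr: SocketAddr) {
        if let Some((_, failures)) =
            self.known_addrs.iter_mut().find(|(a, _)| *a == addr)
        {
            *failures += 1;
        }
        self.known_addrs.retain(|(_, f)| *f < MAX_ADDR_FAILURES);
    }

    // try_connect() shuffles before iterating, so a freshly appended
    // valid address is not stuck behind a queue of stale ones.
    fn candidate_addrs(&self) -> Vec<SocketAddr> {
        let mut addrs: Vec<_> = self.known_addrs.iter().map(|(a, _)| *a).collect();
        addrs.shuffle(&mut rand::thread_rng());
        addrs
    }
}
```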
A companion commit in this series contains the first change made to fix this issue:
1. TCP connect timeout (netapp.rs): Wrap TcpStream::connect() in
tokio::time::timeout(10s). Caps per-address attempt from 75-130s
to 10s, reducing worst-case 10-addr reconnection from ~750s to ~100s.
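A minimal sketch of that wrapping, assuming the surrounding error handling in netapp.rs differs in detail:
```rust
use std::net::SocketAddr;
use std::time::Duration;
use tokio::net::TcpStream;
use tokio::time::timeout;

const TCP_CONNECT_TIMEOUT: Duration = Duration::from_secs(10);

async fn connect_with_timeout(addr: SocketAddr) -> std::io::Result<TcpStream> {
    match timeout(TCP_CONNECT_TIMEOUT, TcpStream::connect(addr)).await {
        // connect() completed (successfully or not) within 10s
        Ok(res) => res,
        // connect() was still waiting on the kernel's SYN retries: give up
        Err(_elapsed) => Err(std::io::Error::new(
            std::io::ErrorKind::TimedOut,
            "TCP connect timed out",
        )),
    }
}
```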
## Summary
This PR ensures that the `LifecycleWorker` yields to the Tokio scheduler at least once between each batch of 100 objects.
## Problem being solved
I administer a Garage cluster that has been experiencing timeouts on all endpoints while the lifecycle worker runs at midnight UTC: `Ping timeout` error messages, and even requests eventually failing due to `Could not reach quorum ...`.
I have found that this happens while the lifecycle worker is working on a big bucket (containing millions of objects) with a lifecycle rule that applies to very few objects.
The `process_object()` function does not hit any `await`:
- `last_bucket` is always the same, so the `bucket_table` is not read asynchronously
- no transaction is made on the `object_table` because my lifecycle rule (almost) never applies to any object
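The fix itself amounts to yielding periodically inside the scan loop. A minimal self-contained sketch (not the actual Garage code; the loop structure is illustrative):
```rust
use tokio::task::yield_now;

// Yield to the Tokio scheduler once per batch of 100 items, so that a
// loop whose body never awaits cannot starve every other task.
async fn scan<T>(items: impl IntoIterator<Item = T>, mut process: impl FnMut(T)) {
    for (i, item) in items.into_iter().enumerate() {
        process(item); // may complete without hitting any .await
        if i % 100 == 99 {
            yield_now().await;
        }
    }
}
```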
The first commit in this PR adds an executable which reproduces the problem I've been experiencing in a self-contained way: the lifecycle worker starves the Tokio scheduler so much that no other task is able to run (or only very rarely).
To run it: `cargo run -p garage_model --bin lifecycle-starvation-test`.
This commit can be dropped post-review, as it's only useful to demonstrate the starvation.
The error messages stopped completely after deploying the extra yield to the nodes of my cluster.
As far as I can see, the duration of the lifecycle worker task has not changed at all (judging by the timestamps produced either by the self-contained binary or by the `Lifecycle worker finished` message on each of my nodes).
## Note
Another potential fix would have been to force the `WorkerProcessor` to yield before re-enqueuing a busy task, but this would have affected all Garage workers even though only the `LifecycleWorker` is being uncooperative.
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1396
Reviewed-by: Alex <lx@deuxfleurs.fr>
Co-authored-by: Gauthier Zirnhelt <gauthier.zirnhelt@insimo.fr>
Co-committed-by: Gauthier Zirnhelt <gauthier.zirnhelt@insimo.fr>
This fixes a regression with respect to garage-v1, likely caused by the version upgrade of quick_xml.
Currently, garage-v2 will emit empty `ErrorDocument`/`IndexDocument`/`RedirectAllRequestsTo` elements in the response of GetBucketWebsite if there are no corresponding values.
This is somewhat wrong; at least, the S3 documentation for RedirectAllRequestsTo (https://docs.aws.amazon.com/AmazonS3/latest/API/API_RedirectAllRequestsTo.html) states that it has a required HostName field, so emitting an empty RedirectAllRequestsTo is invalid.
This PR skips emitting these XML elements when they contain no value.
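For illustration, with quick_xml's serde support the shape of the fix is to make these fields optional and skip them when absent; the struct and field names below are illustrative, not the actual Garage types:
```rust
use serde::Serialize;

#[derive(Serialize)]
#[serde(rename = "WebsiteConfiguration")]
struct WebsiteConfiguration {
    // Omit each element entirely when there is no corresponding value.
    #[serde(rename = "ErrorDocument", skip_serializing_if = "Option::is_none")]
    error_document: Option<ErrorDocument>,
    #[serde(rename = "IndexDocument", skip_serializing_if = "Option::is_none")]
    index_document: Option<IndexDocument>,
    #[serde(
        rename = "RedirectAllRequestsTo",
        skip_serializing_if = "Option::is_none"
    )]
    redirect_all_requests_to: Option<RedirectAllRequestsTo>,
}

#[derive(Serialize)]
struct ErrorDocument {
    #[serde(rename = "Key")]
    key: String,
}

#[derive(Serialize)]
struct IndexDocument {
    #[serde(rename = "Suffix")]
    suffix: String,
}

#[derive(Serialize)]
struct RedirectAllRequestsTo {
    #[serde(rename = "HostName")]
    host_name: String, // required per the AWS docs cited above
}
```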
Co-authored-by: Armaël Guéneau <armael.gueneau@ens-lyon.org>
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1391
Co-authored-by: Armael <armael@noreply.localhost>
Co-committed-by: Armael <armael@noreply.localhost>
This is a port of #1320 on top of the main-v2 branch.
Co-authored-by: Armaël Guéneau <armael.gueneau@ens-lyon.org>
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1392
Co-authored-by: Armael <armael@noreply.localhost>
Co-committed-by: Armael <armael@noreply.localhost>
This makes it easier to correlate an error with the request that caused it, which can be helpful during debugging or when setting up automation based on log content.
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1390
Reviewed-by: Alex <lx@deuxfleurs.fr>
Reviewed-by: maximilien <git@mricher.fr>
Co-authored-by: trinity-1686a <trinity@deuxfleurs.fr>
Co-committed-by: trinity-1686a <trinity@deuxfleurs.fr>
Made a quick PR to add a sub-command called `completions` for generating shell completions; it was driving me pretty crazy that this wasn't a thing :P.
Tried my best to do everything properly; let me know if I need to change anything. I tested it and it works perfectly.
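For reference, the mechanism is roughly the following (a self-contained sketch using clap_complete; the real Garage CLI structure differs, and the enum layout here is illustrative):
```rust
use clap::{CommandFactory, Parser, Subcommand};
use clap_complete::{generate, Shell};
use std::io;

#[derive(Parser)]
#[command(name = "garage")]
struct Cli {
    #[command(subcommand)]
    command: Command,
}

#[derive(Subcommand)]
enum Command {
    /// Generate completions for the given shell
    Completions {
        #[arg(value_enum)]
        shell: Shell,
    },
}

fn main() {
    match Cli::parse().command {
        Command::Completions { shell } => {
            // Write the completion script for the chosen shell to stdout.
            let mut cmd = Cli::command();
            generate(shell, &mut cmd, "garage", &mut io::stdout());
        }
    }
}
```
Typical usage would then be something like `garage completions bash > /etc/bash_completion.d/garage`.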
Co-authored-by: MrSnowy <snow@mrsnowy.dev>
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1386
Reviewed-by: Alex <lx@deuxfleurs.fr>
Co-authored-by: MrSnowy <mrsnowy@noreply.localhost>
Co-committed-by: MrSnowy <mrsnowy@noreply.localhost>
## Summary
This PR fixes S3 `DeleteObjects` XML parsing when the request body is pretty-printed (contains indentation/newlines as whitespace text nodes).
Although PR #1324 already tried to address this, parsing could still fail with:
`InvalidRequest: Bad request: Invalid delete XML query`
because non-element nodes were validated but not actually skipped in the parsing loop.
## What changed
- In `src/api/s3/delete.rs`:
  - Properly skip whitespace-only text nodes while iterating over `<Delete>` children.
  - Keep rejecting non-whitespace stray text content.
  - Parse the root `<Delete>` element more robustly by selecting the first element child.
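For illustration, the skipping logic looks roughly like this (a sketch using roxmltree; the parser actually used in `src/api/s3/delete.rs` may differ in detail):
```rust
// Collect the keys of a <Delete> request, tolerating pretty-printing.
fn parse_delete(body: &str) -> Result<Vec<String>, String> {
    let doc = roxmltree::Document::parse(body).map_err(|e| e.to_string())?;
    let delete = doc.root_element(); // the <Delete> element
    let mut keys = Vec::new();
    for child in delete.children() {
        if child.is_text() {
            match child.text() {
                // Skip whitespace-only text nodes produced by pretty-printing...
                Some(t) if t.trim().is_empty() => continue,
                // ...but keep rejecting stray non-whitespace text content.
                _ => return Err("Invalid delete XML query".into()),
            }
        }
        if child.has_tag_name("Object") {
            if let Some(key) = child
                .children()
                .find(|n| n.has_tag_name("Key"))
                .and_then(|n| n.text())
            {
                keys.push(key.to_string());
            }
        }
    }
    Ok(keys)
}
```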
## Tests added
New unit tests in `src/api/s3/delete.rs`:
- `parse_delete_objects_xml_with_formatting`: pretty-printed valid XML is accepted.
- `parse_delete_objects_xml_accepts_compact_valid_xml`: compact valid XML is accepted.
- `parse_delete_objects_xml_rejects_non_whitespace_text_node`: compact XML with stray text is rejected.
- `parse_delete_objects_xml_rejects_pretty_print_with_stray_text`: pretty-printed XML with stray text is rejected.
## Validation
Executed:
```bash
cargo test -p garage_api_s3 parse_delete_objects_xml -- --nocapture
```
Result: all parser tests pass.
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1374
Co-authored-by: milouz1985 <francois.hoyez@gmail.com>
Co-committed-by: milouz1985 <francois.hoyez@gmail.com>
## Problem
`hugo deploy` is broken with Garage on recent hugo versions when using gzip matchers
## Why?
We don't handle multi-value headers correctly; in this case, this specific header combination:
```
Content-Encoding: gzip
Content-Encoding: aws-chunked
```
is interpreted as:
```
Content-Encoding: gzip
```
instead of:
```
Content-Encoding: gzip,aws-chunked
```
This fails both (1) the signature check and (2) the streaming check.
## Proposed fix
- Take multi-value headers into account when building the Canonical Request (validated with hugo deploy + AWS SDK v2)
- Take multi-value headers into account (both comma-separated and separate HeaderEntry values) when removing `aws-chunked` (validated with hugo deploy + AWS SDK v2); see the sketch below
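A sketch of the `aws-chunked` removal, using the `http` crate's `HeaderMap` (the function name and exact call site are illustrative): collect every `Content-Encoding` entry, split each on commas, drop `aws-chunked`, and write back a single normalized value.
```rust
use http::header::{HeaderMap, HeaderValue, CONTENT_ENCODING};

fn strip_aws_chunked(headers: &mut HeaderMap) {
    // Gather all entries (separate HeaderEntry values and comma-separated
    // lists alike), minus any "aws-chunked" token.
    let remaining: Vec<String> = headers
        .get_all(CONTENT_ENCODING)
        .iter()
        .filter_map(|v| v.to_str().ok())
        .flat_map(|v| v.split(','))
        .map(|e| e.trim().to_string())
        .filter(|e| !e.is_empty() && !e.eq_ignore_ascii_case("aws-chunked"))
        .collect();

    // Replace however many entries there were with one canonical value.
    headers.remove(CONTENT_ENCODING);
    if !remaining.is_empty() {
        if let Ok(value) = HeaderValue::from_str(&remaining.join(",")) {
            headers.insert(CONTENT_ENCODING, value);
        }
    }
}
```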
## Full explanation
Currently, `hugo deploy` on `hugo v0.152.2` or more recent uses AWS SDK v2 only and supports sending gzipped content.
That is configured with a matcher like this:
```yaml
deployment:
  matchers:
    - pattern: "^.+\\.(woff2|woff|svg|ttf|otf|eot|js|css)$"
      cacheControl: "max-age=31536000, no-transform, public"
      gzip: true # <-------- here
```
Also, with SDK v2, hugo is streaming all of its files.
Thus, it sends this kind of request:
```python
Request {
method: PUT,
uri: /sebou/pagefind/pagefind.js?x-id=PutObject,
version: HTTP/1.1,
headers: {
"host": "localhost",
"user-agent": "aws-sdk-go-v2/1.39.2 ua/2.1 os/linux lang/go#1.25.6 md/GOOS#linux md/GOARCH#amd64 api/s3#1.84.0 ft/s3-transfer m/E,G,Z,g",
"content-length": "10026",
"accept-encoding": "identity",
"amz-sdk-invocation-id": "aed6df34-a67c-4bab-b63b-2b3777b751a0",
"amz-sdk-request": "attempt=1; max=3",
"authorization": "AWS4-HMAC-SHA256 Credential=GKxxxxx/20260227/garage/s3/aws4_request, SignedHeaders=accept-encoding;amz-sdk-invocation-id;amz-sdk-request;cache-control;content-encoding;content-length;content-type;host;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length;x-amz-meta-md5chksum;x-amz-trailer, Signature=76cd9b77f693ca89c2e6dd2a4dc55f83d4a82eca0f563d9d095ff96076f7b057",
"cache-control": "max-age=31536000, no-transform, public",
"content-encoding": "gzip", # <---- see here 1st instance of Content-Encoding
"content-encoding": "aws-chunked", # <---- 2nd instance of Content-Encoding
"content-type": "text/javascript",
"via": "2.0 Caddy",
"x-amz-content-sha256": "STREAMING-UNSIGNED-PAYLOAD-TRAILER",
"x-amz-date": "20260227T132212Z",
"x-amz-decoded-content-length": "9982",
"x-amz-meta-md5chksum": "aad88ac0bf704e91584b8d9ad9796670",
"x-amz-trailer": "x-amz-checksum-crc32",
"x-forwarded-for": "::1",
"x-forwarded-host": "localhost",
"x-forwarded-proto": "https"
},
body: Body(Streaming)
}
```
But our canonical request function only calls `HeaderMap::get()`, which returns just the first value, instead of `HeaderMap::get_all()`, which returns all the values for a header. This leads to the following invalid `CanonicalRequest` value:
```python
PUT
/sebou/pagefind/pagefind.js
x-id=PutObject
accept-encoding:identity
amz-sdk-invocation-id:aed6df34-a67c-4bab-b63b-2b3777b751a0
amz-sdk-request:attempt=1; max=3
cache-control:max-age=31536000, no-transform, public
content-encoding:gzip # <----- see here, we kept only gzip and dropped aws-chunked
content-length:10026
content-type:text/javascript
host:localhost
x-amz-content-sha256:STREAMING-UNSIGNED-PAYLOAD-TRAILER
x-amz-date:20260227T132212Z
x-amz-decoded-content-length:9982
x-amz-meta-md5chksum:aad88ac0bf704e91584b8d9ad9796670
x-amz-trailer:x-amz-checksum-crc32
accept-encoding;amz-sdk-invocation-id;amz-sdk-request;cache-control;content-encoding;content-length;content-type;host;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length;x-amz-meta-md5chksum;x-amz-trailer
```
Amazon is crystal clear that, instead of dropping the other values, we should concatenate them with a comma:

https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_sigv-create-signed-request.html#create-canonical-request
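The corresponding fix on the signing side, sketched with the `http` crate (the helper name is illustrative): build each canonical header value from `get_all()` and join with commas.
```rust
use http::HeaderMap;

// Join all values of a signed header with a comma, per the SigV4 rules
// linked above, instead of taking only the first value with get().
fn canonical_header_value(headers: &HeaderMap, name: &str) -> String {
    headers
        .get_all(name)
        .iter()
        .filter_map(|v| v.to_str().ok())
        .map(str::trim)
        .collect::<Vec<_>>()
        .join(",")
}
```
With the request above, `canonical_header_value(&headers, "content-encoding")` yields `gzip,aws-chunked`, matching what the SDK signed.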
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/1369
Reviewed-by: Alex <lx@deuxfleurs.fr>
Co-authored-by: Quentin Dufour <quentin@deuxfleurs.fr>
Co-committed-by: Quentin Dufour <quentin@deuxfleurs.fr>