Replace the `io.Pipe` from streamingBitrotWriter -> CreateFile with a fixed size ring buffer. This will add an output buffer for encoded shards to be written to disk - potentially via RPC. This will remove blocking when `(*streamingBitrotWriter).Write` is called, and it writes hashes and data. With current settings, the write looks like this: ``` Outbound ┌───────────────────┐ ┌────────────────┐ ┌───────────────┐ ┌────────────────┐ │ │ Parr. │ │ (http body) │ │ │ │ │ Bitrot Hash │ Write │ Pipe │ Read │ HTTP buffer │ Write (syscall) │ TCP Buffer │ │ Erasure Shard │ ──────────► │ (unbuffered) │ ────────────► │ (64K Max) │ ───────────────────► │ (4MB) │ │ │ │ │ │ (io.Copy) │ │ │ └───────────────────┘ └────────────────┘ └───────────────┘ └────────────────┘ ``` We write a Hash (32 bytes). Since the pipe is unbuffered, it will block until the 32 bytes have been delivered to the TCP buffer, and the next Read hits the Pipe. Then we write the shard data. This will typically be bigger than 64KB, so it will block until two blocks have been read from the pipe. When we insert a ring buffer: ``` Outbound ┌───────────────────┐ ┌────────────────┐ ┌───────────────┐ ┌────────────────┐ │ │ │ │ (http body) │ │ │ │ │ Bitrot Hash │ Write │ Ring Buffer │ Read │ HTTP buffer │ Write (syscall) │ TCP Buffer │ │ Erasure Shard │ ──────────► │ (2MB) │ ────────────► │ (64K Max) │ ───────────────────► │ (4MB) │ │ │ │ │ │ (io.Copy) │ │ │ └───────────────────┘ └────────────────┘ └───────────────┘ └────────────────┘ ``` The hash+shard will fit within the ring buffer, so writes will not block - but will complete after a memcopy. Reads can fill the 64KB buffer if there is data for it. If the network is congested, the ring buffer will become filled, and all syscalls will be on full buffers. Only when the ring buffer is filled will erasure coding start blocking. Since there is always "space" to write output data, we remove the parallel writing since we are always writing to memory now, and the goroutine synchronization overhead probably not worth taking. If the output were blocked in the existing, we would still wait for it to unblock in parallel write, so it would make no difference there - except now the ring buffer smoothes out the load. There are some micro-optimizations we could look at later. The biggest is that, in most cases, we could encode directly to the ring buffer - if we are not at a boundary. Also, "force filling" the Read requests (i.e., blocking until a full read can be completed) could be investigated and maybe allow concurrent memory on read and write.
ringbuffer
A circular buffer (ring buffer) in Go, implemented io.ReaderWriter interface
Usage
package main
import (
	"fmt"
	"github.com/smallnest/ringbuffer"
)
func main() {
	rb := ringbuffer.New(1024)
	// write
	rb.Write([]byte("abcd"))
	fmt.Println(rb.Length())
	fmt.Println(rb.Free())
	// read
	buf := make([]byte, 4)
	rb.Read(buf)
	fmt.Println(string(buf))
}
It is possible to use an existing buffer with by replacing New with NewBuffer.
Blocking vs Non-blocking
The default behavior of the ring buffer is non-blocking, meaning that reads and writes will return immediately with an error if the operation cannot be completed. If you want to block when reading or writing, you must enable it:
	rb := ringbuffer.New(1024).SetBlocking(true)
Enabling blocking will cause the ring buffer to behave like a buffered io.Pipe.
Regular Reads will block until data is available, but not wait for a full buffer. Writes will block until there is space available and writes bigger than the buffer will wait for reads to make space.
TryRead and TryWrite are still available for non-blocking reads and writes.
To signify the end of the stream, close the ring buffer from the writer side with rb.CloseWriter()
Either side can use rb.CloseWithError(err error) to signal an error and close the ring buffer.
Any reads or writes will return the error on next call.
In blocking mode errors are stateful and the same error will be returned until rb.Reset() is called.

