By adding a "landing area" at the end of the buffer, it becomes safe to
parse more bytes at once. On 32-bit this makes fgets run about 4% faster
but it does not save anything on 64-bit.
A bug in the algorithm used to find an LF in multiple bytes at once
made byte 0x80 trigger detection of byte 0x00, thus 0x8A matches byte
0x0A. In practice, this issue never happens since byte 0x8A won't be
displayed in logs (or it will be encoded). This could still possibly
happen in mixed logs.
A new idea came up to detect the presence of a null byte in a word.
It saves several operations compared to the previous one, and eliminates
the jumps (about 6 instructions which can run 2-by-2 in parallel).
This sole optimisation improved the line count speed by about 30%.