mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-31 08:30:59 +01:00

2007/03/30 - Header storage in trees

This document describes how to store headers in a radix tree, providing fast
access to any known position while retaining the ability to grow or shrink
any arbitrary header without having to recompute all positions.

Principle :
  We have a radix tree stored in an integer array, in which each entry
  represents the total number of bytes used by all headers whose position is
  below it in the tree. This ensures that we can compute any header's position
  in O(log(N)), where N is the number of headers.

Example with N=16 :

   +-----------------------+
   |                       |
   +-----------+           +-----------+
   |           |           |           |
   +-----+     +-----+     +-----+     +-----+
   |     |     |     |     |     |     |     |
   +--+  +--+  +--+  +--+  +--+  +--+  +--+  +--+
   |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |

   0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F

   To reach header 6, we have to compute hdr[0]+hdr[4]+hdr[6] : at each step,
   the lowest bit set in the index is cleared (0110 -> 0100 -> 0000).

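The lookup above can be sketched in C as follows. This is only an
illustration : the name hdr_pos() is hypothetical, and it assumes that hdr[i]
holds the number of bytes between header i and its parent (i with its lowest
bit set cleared), hdr[0] being the position of the first header.

```c
/* Hypothetical helper, not taken from the HAProxy sources.
 * Assumption: hdr[idx] is the distance in bytes between header <idx> and its
 * parent, the parent's index being <idx> with its lowest bit set cleared.
 * hdr[0] is the position of the first header.
 */
unsigned int hdr_pos(const unsigned int *hdr, unsigned int idx)
{
    unsigned int pos = hdr[0];

    while (idx) {
        pos += hdr[idx];
        idx &= idx - 1;    /* clear the lowest bit set: 6 -> 4 -> 0 */
    }
    return pos;
}
```

Under these assumptions, eight headers of 10 bytes each would give the array
{0, 10, 20, 10, 40, 10, 20, 10}, and hdr_pos(hdr, 6) would return
hdr[0]+hdr[4]+hdr[6] = 60.
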
   With this method, it becomes easy to grow any header and update the array.
   To achieve this, for each bit taken from right to left, we replace this bit
   and all those on its right with a 1 followed by zeroes. The resulting entry
   is updated if its index is higher than the current one, and we stop once
   this index goes past the last stored header.

   For instance, if we want to grow hdr[6], we proceed like this :

   6 = 0110 (BIN)

   Let's consider the values to update :

   (bit 0) : (0110 & ~0001) | 0001 = 0111 = 7 >  6 => update
   (bit 1) : (0110 & ~0011) | 0010 = 0110 = 6 <= 6 => leave it
   (bit 2) : (0110 & ~0111) | 0100 = 0100 = 4 <= 6 => leave it
   (bit 3) : (0110 & ~1111) | 1000 = 1000 = 8 >  6 => update
   (bit 4) : larger than array size, stop.


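The update procedure above could be written like this in C. It is a minimal
sketch : hdr_grow() and its arguments are hypothetical names, and a negative
delta is assumed to shrink the header.

```c
/* Hypothetical helper, not taken from the HAProxy sources.
 * Accounts for header <idx> growing by <delta> bytes in an array of <nb>
 * entries. For each bit, this bit and those on its right are replaced with
 * a 1 followed by zeroes, and the resulting entry is updated only when it
 * lies after <idx>.
 */
void hdr_grow(unsigned int *hdr, unsigned int nb, unsigned int idx, int delta)
{
    unsigned int bit, cand;

    for (bit = 1; bit; bit <<= 1) {
        cand = (idx & ~((bit << 1) - 1)) | bit;
        if (cand >= nb)
            break;              /* larger than array size, stop */
        if (cand > idx)
            hdr[cand] += delta; /* this entry spans header <idx> */
    }
}
```

With 16 entries, hdr_grow(hdr, 16, 6, 5) only touches hdr[7] and hdr[8], as
in the example. This layout is close to what is usually called a Fenwick tree
(or binary indexed tree).
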
It's easy to walk through the tree too. We only need one iteration per bit
changing from X to the common ancestor, and one per bit changing from the
ancestor to Y. The ancestor is found while walking. To go from X to Y :

   pos = pos(X);

   while (Y != X) {
     if (Y > X) {
       // walk from Y to ancestor
       pos += hdr[Y];
       Y &= (Y - 1);    // clear the lowest bit set
     } else {
       // walk from X to ancestor
       pos -= hdr[X];
       X &= (X - 1);    // clear the lowest bit set
     }
   }
   // pos is now the position of the initial Y

However, walking the tree linearly is not trivial anymore. We have to move
from a known place to another known place, but a jump to the next entry costs
the same as a jump to a random place.

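The walk from X to Y can be turned into a small C function. This is a sketch
under the same assumption as the loop (hdr[i] is the distance between entry i
and its parent) ; hdr_walk() and its arguments are hypothetical names.

```c
/* Hypothetical helper, not taken from the HAProxy sources.
 * Returns the position of header <y>, given <pos_x>, the already known
 * position of header <x>. The common ancestor is found while walking.
 */
unsigned int hdr_walk(const unsigned int *hdr, unsigned int x,
                      unsigned int pos_x, unsigned int y)
{
    unsigned int pos = pos_x;

    while (y != x) {
        if (y > x) {
            pos += hdr[y];    /* walk from y to ancestor */
            y &= y - 1;
        } else {
            pos -= hdr[x];    /* walk from x to ancestor */
            x &= x - 1;
        }
    }
    return pos;
}
```

Moving to the next entry is just hdr_walk(hdr, i, pos, i + 1). Going from 3
to 4, for instance, takes three iterations : one to bring 4 down to 0, then
two to bring 3 down to 0.
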
Other caveats :
  - it is not possible to remove a header, it is only possible to empty it.
  - it is not possible to insert a header, as that would imply renumbering
    all the following ones.
  => this means that a "defrag" function is required. Headers should preferably
     be appended, otherwise be stuffed on top of destroyed ones, and only be
     inserted if absolutely required.


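The text does not say what the "defrag" function would do exactly. The sketch
below is one possible reading, given only to show what a renumbering costs :
it assumes that an emptied header has a length of zero, that the lengths are
available in a separate array, and that hdr[0] is zero. hdr_defrag() is a
hypothetical name, and moving the bytes in the buffer is left out.

```c
/* Hypothetical sketch, not taken from the HAProxy sources.
 * Compacts <len>, an array of <nb> header lengths in which emptied headers
 * have a length of zero, then rebuilds the tree in <hdr>.
 * Assumption: the first header sits at position zero.
 * Returns the number of remaining headers.
 */
unsigned int hdr_defrag(unsigned int *len, unsigned int nb, unsigned int *hdr)
{
    unsigned int in, out = 0, idx, up, pos = 0, ppos;

    /* drop emptied headers, which renumbers all the following ones */
    for (in = 0; in < nb; in++)
        if (len[in])
            len[out++] = len[in];

    /* hdr[idx] = distance between header <idx> and its parent */
    for (idx = 0; idx < out; idx++) {
        /* parent's position, summed from the entries already rebuilt */
        ppos = 0;
        for (up = idx & (idx - 1); up; up &= up - 1)
            ppos += hdr[up];
        hdr[idx] = pos - ppos;
        pos += len[idx];
    }
    return out;
}
```

Every entry is rewritten here, which is why insertion should remain the last
resort.
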
Once we have this, we can focus on a 32-bit header descriptor which would
look like this :

{
  unsigned line_len :13; /* total line length, including CRLF */
  unsigned name_len  :6; /* header name length, max 63 chars */
  unsigned sp1       :5; /* max spaces before value : 31 */
  unsigned sp2       :8; /* max spaces after value : 255 */
}

Example :

  Connection:      close           \r\n
  <---------+-----+-----+-------------> line_len
  <-------->|     |     |               name_len
            <----->     |               sp1
                        <-------------> sp2

Remarks :
  - if there are more than 31 spaces before the value, the buffer will have to
    be moved before being registered

  - if there are more than 255 spaces after the value, the buffer will have to
    be moved before being registered

  - we can use the empty header name as an indicator for a deleted header

  - it would be wise to format a new request before sending lots of random
    spaces to the servers.

  - normal clients do not send such crap, so those operations *may* reasonably
    be more expensive than the rest provided that the other ones are very fast.

It would be handy to have the following macros :

  hdr_eon(hdr)  => end of name
  hdr_sov(hdr)  => start of value
  hdr_eof(hdr)  => end of value
  hdr_vlen(hdr) => length of value
  hdr_hlen(hdr) => total header length


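These macros could look like the sketch below. Several things are assumed :
the structure name hdr_desc and the hdr_set() helper are hypothetical, all
results are offsets from the beginning of the line, and, as drawn in the
example, sp1 is taken to cover the colon and sp2 the trailing CRLF.

```c
/* Hypothetical names, not taken from the HAProxy sources. */
struct hdr_desc {
    unsigned line_len :13; /* total line length, including CRLF */
    unsigned name_len  :6; /* header name length, max 63 chars */
    unsigned sp1       :5; /* assumed: colon + spaces before value */
    unsigned sp2       :8; /* assumed: spaces after value + CRLF */
};

/* all results are offsets from the beginning of the header line */
#define hdr_eon(hdr)  ((hdr)->name_len)               /* end of name */
#define hdr_sov(hdr)  ((hdr)->name_len + (hdr)->sp1)  /* start of value */
#define hdr_eof(hdr)  ((hdr)->line_len - (hdr)->sp2)  /* end of value */
#define hdr_vlen(hdr) (hdr_eof(hdr) - hdr_sov(hdr))   /* length of value */
#define hdr_hlen(hdr) ((hdr)->line_len)               /* total header length */

/* Fills <hdr>, or returns -1 when one length does not fit its field, in
 * which case the line has to be rewritten before being registered.
 */
int hdr_set(struct hdr_desc *hdr, unsigned int line_len,
            unsigned int name_len, unsigned int sp1, unsigned int sp2)
{
    if (line_len > 8191 || name_len > 63 || sp1 > 31 || sp2 > 255)
        return -1;

    hdr->line_len = line_len;
    hdr->name_len = name_len;
    hdr->sp1 = sp1;
    hdr->sp2 = sp2;
    return 0;
}
```

For the "Connection" line of the example, this reading gives name_len = 10,
sp1 = 7, sp2 = 13 and line_len = 35, hence a value starting at offset 17 and
spanning 5 bytes.
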
A 48-bit encoding would look like this :

  Connection:      close           \r\n
  <---------+------+---+--------------> eoh = 16 bits
  <-------->|      |   |                eon = 8 bits
  <--------------->|   |                sov = 8 bits
                   <--->                vlen = 16 bits

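As an illustration, the 48-bit form could map to the structure below. The
names hdr48, hdr48_eov() and hdr48_tail() are hypothetical, and all fields
are assumed to be counted in bytes from the beginning of the line.

```c
#include <stdint.h>

/* Hypothetical layout, not taken from the HAProxy sources. */
struct hdr48 {
    uint16_t eoh;   /* end of header line, CRLF included */
    uint8_t  eon;   /* end of name */
    uint8_t  sov;   /* start of value */
    uint16_t vlen;  /* length of value */
};

/* offset of the first byte following the value */
static inline unsigned int hdr48_eov(const struct hdr48 *hdr)
{
    return hdr->sov + hdr->vlen;
}

/* bytes left between the value and the end of the line, CRLF included */
static inline unsigned int hdr48_tail(const struct hdr48 *hdr)
{
    return hdr->eoh - hdr48_eov(hdr);
}
```

Here the start of the value is stored directly instead of being derived from
space counts, at the cost of two more bytes per header than the 32-bit form.
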