DragonFly kernel List (threaded) for 2009-03
Add Toeplitz hash function to map packet's CPU
After getting multiple receive queues working on two NICs, I now propose
adding a Toeplitz hash function to map packets to CPUs. It is mainly used
to support "receive side scaling" (RSS) hardware. To make a long story
short: the hardware calculates the hash when it receives a packet and puts
the packet on the proper RX queue along with the calculated hash, which
means we don't need to calculate the hash ourselves and input processing
can be fully parallelized (if multiple TX queue support is added, the
whole forwarding path is even CPU localized). If the packet is
non-fragmented TCP, the hash is calculated over laddr,faddr,lport,fport;
otherwise it is calculated over laddr,faddr.
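
For reference, the hash itself can be sketched as below. This is only a
minimal bit-by-bit version written for illustration, not the proposed
kernel code; the function name toeplitz_hash and its arguments are made
up here. It assumes the input bytes are already laid out in network
order and that the key is at least 4 bytes longer than the input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bit-by-bit Toeplitz hash (illustrative sketch, hypothetical name).
 * For every set bit of the input, taken MSB first, XOR in the 32-bit
 * window of the key that starts at the same bit offset.
 */
static uint32_t
toeplitz_hash(const uint8_t *key, size_t keylen,
    const uint8_t *data, size_t len)
{
	uint32_t window, hash = 0;
	size_t i;
	int b;

	/* The window slides over len * 8 bits, so 4 extra key bytes are needed. */
	assert(keylen >= len + 4);

	/* The window starts out as the first 32 bits of the key. */
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
	    ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (i = 0; i < len; i++) {
		for (b = 7; b >= 0; b--) {
			if (data[i] & (1u << b))
				hash ^= window;
			/* Slide the window left by one bit of the key. */
			window <<= 1;
			if (key[i + 4] & (1u << b))
				window |= 1;
		}
	}
	return hash;
}
```

For a non-fragmented TCP packet the input would be the 12 bytes of
laddr,faddr,lport,fport; for everything else it would be the 8 bytes of
laddr,faddr.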
There are two things we need to overcome:
1) The result of the hash function is non-commutative in the M$ paper,
i.e. faddr,laddr,fport,lport and laddr,faddr,lport,fport give
different results.
Thanks to corecode's suggestion, as long as 0xabcd is replicated to
form the key, the result of the hash function _is_ commutative.
2) It is computationally heavy
Thanks to corecode again, we can cache a pre-calculated result
table, so we only need to index an array and XOR the results.
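
Both points can be sketched together as below. This is an illustration
only, not the implementation referred to in this message; all names
(KEY_PATTERN32, cache, toeplitz_cache_init, toeplitz_lookup,
toeplitz_tuple) are made up here. It assumes the key is the 2-byte
pattern 0xabcd repeated, so the 32-bit key window at bit offset n is just
0xabcdabcd rotated left by n mod 16. A byte's contribution then depends
only on its value and on whether it sits at an even or odd byte offset,
which is why a 2 x 256 table is enough, with the entries combined by XOR
as in the Toeplitz definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed key: the 16-bit pattern 0xabcd replicated, viewed as 32 bits. */
#define KEY_PATTERN32	0xabcdabcdu

/* cache[p][v]: contribution of byte value v at an even (p=0) or odd (p=1) offset. */
static uint32_t cache[2][256];
static int cache_ready;

static uint32_t
rotl32(uint32_t x, unsigned int r)
{
	r &= 31;
	return (x << r) | (x >> ((32 - r) & 31));
}

/* Pre-calculate the result table once. */
static void
toeplitz_cache_init(void)
{
	unsigned int pos, val, b;

	for (pos = 0; pos < 2; pos++) {
		for (val = 0; val < 256; val++) {
			uint32_t res = 0;

			for (b = 0; b < 8; b++) {
				/* Key window at bit offset n: pattern rotated by n mod 16. */
				if (val & (0x80u >> b))
					res ^= rotl32(KEY_PATTERN32, pos * 8 + b);
			}
			cache[pos][val] = res;
		}
	}
	cache_ready = 1;
}

/* Hash a byte string with one table lookup and one XOR per byte. */
static uint32_t
toeplitz_lookup(const uint8_t *data, size_t len)
{
	uint32_t hash = 0;
	size_t i;

	assert(cache_ready);
	for (i = 0; i < len; i++)
		hash ^= cache[i & 1][data[i]];
	return hash;
}

/* Hash laddr,faddr,lport,fport; values are serialized big-endian here. */
static uint32_t
toeplitz_tuple(uint32_t laddr, uint32_t faddr, uint16_t lport, uint16_t fport)
{
	const uint8_t buf[12] = {
		(uint8_t)(laddr >> 24), (uint8_t)(laddr >> 16),
		(uint8_t)(laddr >> 8), (uint8_t)laddr,
		(uint8_t)(faddr >> 24), (uint8_t)(faddr >> 16),
		(uint8_t)(faddr >> 8), (uint8_t)faddr,
		(uint8_t)(lport >> 8), (uint8_t)lport,
		(uint8_t)(fport >> 8), (uint8_t)fport,
	};

	return toeplitz_lookup(buf, sizeof(buf));
}
```

Swapping laddr with faddr moves every address bit by 32 positions, and
swapping lport with fport moves every port bit by 16 positions. Both are
multiples of the 16-bit key period, so toeplitz_tuple(a, b, p, q) and
toeplitz_tuple(b, a, q, p) return the same value in this sketch.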
A simple implementation is at:
I used it to verify that hardware gives the correct result :)
The whole thing is not implemented yet, but if you don't think it's a bad
idea, I will move on to implement it. Note that it is not intended to
replace the current packet hash function.
Live Free or Die