DragonFly users List (threaded) for 2005-02
Re: em driver - issue #2
In a message dated 2/6/2005 1:26:04 PM Eastern Standard Time, Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx> writes:
>:Ok, I have some more info on this that may help to sort it out.
>:I set the m_getcl() in em_get_buf to MB_WAIT and the symptoms are
>:interesting. After about 5 seconds, the mbuf exhausted message
>:appears again. But, it doesn't lock up. For a short time the NIC is
>:issuing flow controls (as the transmit stream is being held up to a
>:lower than optimal volume); after a while the box is able to handle
>:the full stream normally, showing ~144000 received packets in the
>:ICMP unreachable message, as is expected.
>:I assume this is some sort of memory cache that gets built up over
>:time. Something that is pre-allocated in FreeBSD but not in Dfly?
>:It appears that the lockup is a bug in the em driver; one that perhaps
>:just doesn't happen often enough for anyone to have gone to the trouble
>:of tracking it down. But it's exacerbated by the mbuf problem, so if that
>:can be cleared up it should be "good enough", at least for the time being.
> My guess is that what is going on is that the EM device is unable
> to allocate a buffer to the receive ring. This creates 'holes'
> in the receive ring. The EM device's RX interrupt is stopping either
> when it hits a hole, or if there are no good descriptors left in the
> receive ring. The restart only occurs on the next em_poll() or
> em_intr() which in this case would be a transmit interrupt.
> The best solution is probably to create a 'dummy' mbuf to act as filler
> when the device is unable to allocate a new one, and then ignore such
> mbufs (drop the related packets) when they are encountered.
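For reference, the filler idea quoted above could be sketched roughly as follows. This is a toy model only: `struct buf`, `struct rx_slot`, `slot_fill()` and `slot_deliver()` are hypothetical names made up for illustration, not the real mbuf or em driver structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; NOT the real mbuf or em driver structures. */
struct buf {
    bool is_filler;              /* true only for the shared dummy buffer */
};

struct rx_slot {
    struct buf *b;               /* buffer currently posted in this slot */
};

/* One shared dummy buffer, posted whenever a real allocation fails. */
static struct buf filler = { .is_filler = true };

/*
 * Post a buffer to a ring slot. `fresh` is the result of an allocation
 * attempt and is NULL on failure; in that case the filler is posted so
 * that the ring never contains a hole.
 */
static void slot_fill(struct rx_slot *s, struct buf *fresh)
{
    assert(s != NULL);
    s->b = (fresh != NULL) ? fresh : &filler;
}

/*
 * Decide what to do with a packet that landed in this slot:
 * true means pass it up the stack, false means drop it (filler slot).
 */
static bool slot_deliver(const struct rx_slot *s)
{
    assert(s != NULL && s->b != NULL);
    return !s->b->is_filler;
}
```

In this model the ring always looks full to the controller, and every packet that lands in a filler slot is silently dropped.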
I think there are a couple of things wrong with that solution.
First, controllers know what to do with empty descriptors: they fall into an RNR condition. That's part of the basic design. It's the driver's responsibility to clear up such conditions. At 145Kpps, you're not going to achieve much by trying to fool the driver into thinking that it has memory, except losing a lot of packets. The point of the RNR condition is to get the other end to stop sending until you can handle it. The driver is doing something wrong in this case, and it needs to be cleaned up properly.
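As a rough illustration of the behavior described here, the sketch below models a receive ring where an empty slot produces an RNR-style refusal and the driver clears it by refilling. It is a toy model under stated assumptions: `struct rx_ring`, `ring_receive()`, `ring_refill()` and the 8-slot ring size are all invented for the example and do not reflect the actual em driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_RING_SIZE 8          /* assumed toy size; real rings are larger */

/* Hypothetical stand-in for a receive ring; NOT the em driver's layout. */
struct rx_ring {
    bool   has_buf[TOY_RING_SIZE]; /* true if the slot holds a buffer */
    size_t next;                   /* next slot the controller will use */
    int    pool;                   /* buffers left in the toy allocator */
};

/*
 * Controller side: accept one packet into the next slot.
 * Returns false (the RNR-like case) when that slot has no buffer,
 * which is the point where the sender should be held off.
 */
static bool ring_receive(struct rx_ring *r)
{
    assert(r != NULL);
    if (!r->has_buf[r->next])
        return false;
    r->has_buf[r->next] = false;   /* buffer handed up the stack */
    r->next = (r->next + 1) % TOY_RING_SIZE;
    return true;
}

/*
 * Driver side: refill as many empty slots as the pool allows.
 * Returns the number of slots refilled.
 */
static int ring_refill(struct rx_ring *r)
{
    int filled = 0;

    assert(r != NULL);
    for (size_t i = 0; i < TOY_RING_SIZE && r->pool > 0; i++) {
        if (!r->has_buf[i]) {
            r->has_buf[i] = true;
            r->pool--;
            filled++;
        }
    }
    return filled;
}
```

The point of the model is that reception resumes only after the driver refills the ring; no packet is accepted into a slot that has no real buffer behind it.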
Second, the underlying "problem" is that the memory MUST be available. That has to be corrected. It's not acceptable for it to fail the way it's failing. There's no excuse for a system with 20K clusters supposedly allocated to not be able to get the 1600th cluster because of a "bucket" problem. The reason that many drivers don't handle the "can't get memory" condition is that it almost never happens in real-world scenarios. It's a serious problem that it happens so quickly. 1000 packets at gigabit speeds is a tiny amount of time. It makes little sense to redesign the mbuf system only to leave it with such an inefficiency. I don't know enough about it to know how other O/Ss do it, but they don't fail the way dfly does in this instance.
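To put rough numbers on "a tiny amount of time": at the roughly 145Kpps rate seen in this test, and assuming one cluster is consumed per received packet (an assumption, not something measured here), 1000 packets arrive in about 6.9 ms and 1600 in about 11 ms. The helper below is just that arithmetic; `arrival_ms()` is a made-up name.

```c
#include <assert.h>

/*
 * Milliseconds needed for `packets` packets to arrive at a steady
 * rate of `pps` packets per second.
 */
static double arrival_ms(double packets, double pps)
{
    assert(pps > 0.0);
    return packets / pps * 1000.0;
}
```

For example, `arrival_ms(1000.0, 145000.0)` is about 6.9 and `arrival_ms(1600.0, 145000.0)` is about 11.0.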