DragonFly BSD
DragonFly users List (threaded) for 2005-02

Re: em driver - issue #2


From: EM1897@xxxxxxx
Date: Sat, 5 Feb 2005 11:06:52 EST

In a message dated 2/4/05 7:23:54 PM Eastern Standard Time, 
dillon@xxxxxxxxxxxxxxxxxxxx writes:
:OK, now that it's not crashing anymore, on to the next problem. After
:starting the traffic generator, I now get the expected 
:
:"All mbuf clusters exhausted, please see tuning(7)"
:
:message. However it apparently locks up the controller, or at least
:it doesn't recover as expected. Once I stop the traffic generator,
:other interfaces on the dragonfly machine operate normally, but
:the one that was being pounded cannot receive. If I initiate a 
:ping from the dfly box out of the problem controller, then it starts
:working ok. I also get a
:
:"Limiting ICMP unreach response from 11768 to 200 pps"
:
:which implies that quite a few packets get "stuck" somewhere and
:are all released at once when a transmit is initiated, since
:the traffic had stopped arriving long before.
:
:I'll have a chance tomorrow perhaps to dig a bit deeper, but any ideas
:about what might have been changed to cause this (as this isn't a
:problem in freebsd) would be helpful. I can also try it on an fxp
:interface to see if it's an em problem or a system-wide problem.

>    This sounds like an ARP issue.  If the machine is being pounded
>    to the point where it cannot ARP IP addresses, then packets will
>    be queued both on the machine awaiting arp or on other machines
>    trying to arp the one being pounded.

I don't think so. It's only being "pounded" for 10 seconds, and it only
displays the "clusters exhausted" message once. That implies to me that
either the receiver is being disabled and not checked until a transmit
is initiated, or that the ring is not being replenished without a check
from transmit or an interrupt. There would need to be a task that
periodically checks for things like ring entries that got zeroed out
and need replenishing. Otherwise, when the receiver is disabled and
there is nothing to transmit, nothing gets done.
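
To make the suspected stall concrete, here is a toy model in Python. It
is only a sketch of the hypothesis above, not the em driver: the RxRing
class, its method names, and the ring and pool sizes are all made up for
illustration. Each ring slot either has a buffer attached or is empty,
and a slot is left empty when no replacement cluster can be allocated.

```python
class RxRing:
    """Toy receive ring. Hypothetical model, not actual em driver code."""

    def __init__(self, size, pool_free):
        self.pool_free = pool_free      # free clusters left in the pool
        self.slots = [False] * size     # True = slot has a buffer attached
        self.refill()

    def usable(self):
        """Number of slots the hardware could still receive into."""
        return sum(self.slots)

    def refill(self):
        """Attach a buffer to each empty slot while the pool has clusters."""
        filled = 0
        for i in range(len(self.slots)):
            if not self.slots[i] and self.pool_free > 0:
                self.pool_free -= 1
                self.slots[i] = True
                filled += 1
        return filled

    def receive(self, npackets):
        """Pass up to npackets buffers up the stack, replacing each if possible."""
        delivered = 0
        for i in range(len(self.slots)):
            if delivered == npackets:
                break
            if not self.slots[i]:
                continue                # empty slot: nothing can land here
            delivered += 1              # the buffer goes up the stack
            if self.pool_free > 0:
                self.pool_free -= 1     # replacement attached, slot stays usable
            else:
                self.slots[i] = False   # pool exhausted: slot is zeroed out
        return delivered

    def stack_frees(self, n):
        """The stack returns n clusters to the pool."""
        self.pool_free += n


ring = RxRing(size=4, pool_free=4)   # filling the ring drains the pool
burst = ring.receive(10)             # burst arrives, no replacements available
stalled = ring.usable()              # every slot is now empty
ring.stack_frees(4)                  # clusters come back to the pool
still_stalled = ring.receive(10)     # but nothing refills the ring by itself
refilled = ring.refill()             # what a periodic task (or a transmit) would do
recovered = ring.receive(10)         # receive works again
print(burst, stalled, still_stalled, refilled, recovered)
```

In this model the ring stays dead after the pool recovers until
something calls refill(), which is the behavior I would expect if
replenishment only happens from the receive or transmit path.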

Plus, 11768 buffers is too many to be stuck in the ring (unless you
guys use REALLY big rings), so it has to be something that gets shut
down and not restarted properly.

Another issue is why the mbufs are getting exhausted in the first place,
since this doesn't happen on FreeBSD.


