DragonFly BSD
DragonFly users List (threaded) for 2005-02

interrupt routing problem?


From: EM1897@xxxxxxx
Date: Wed, 9 Feb 2005 12:46:03 EST

>We have a MB that is problematic in FreeBSD that I wanted to test
>in DragonFly. When routing a decent volume of traffic through the box
>you get a device timeout pretty quickly, and I think it's due to interrupts
>getting mucked up somehow. My only evidence is that the MB
>runs without any problems in DEVICE POLLING mode. Anyway, the
>same problem occurs in DragonFly, but I wasn't getting the "device
>timeout" error that I expected; the box was just locking up with the
>same symptoms as it did under FreeBSD. I
>noticed that you changed the em_watchdog routine so that the
>message doesn't get displayed unless a link is brought up successfully,
>but then you go and reset the controller anyway.
>
>I don't see that what's been done is correct. First of all you WANT to
>see the message, otherwise you have no idea that anything is wrong.
>If the controller is flapping through the watchdog routine it's not something
>you want to go unnoticed. I also think the logic behind the change is
>faulty, as it assumes that if the link is up, all is well (in which case
>why are you resetting the controller anyway?). In this case, the
>transmitter is locked up with the link up.
>
>BTW, none of this corrects the problem, as the controller stays locked
>up. The only thing found to always work to fix it is:
>
>ifconfig em0 down
>ifconfig em1 down
>ifconfig em0 up
>ifconfig em1 up
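
The four commands above can be wrapped in a small helper so the same down/up sequence can be repeated for any set of interfaces. This is only a sketch of the manual workaround, not a fix: the function names `run_cmd` and `bounce_ifaces` and the `DRYRUN` variable are made up for illustration, and running it for real needs root.

```shell
#!/bin/sh
# Run a command, or just print it when DRYRUN=1.
# DRYRUN is a hypothetical knob for this sketch, not an ifconfig feature.
run_cmd() {
    if [ "${DRYRUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Take every named interface down, then bring them all back up,
# in the same order as the manual workaround.
bounce_ifaces() {
    for ifn in "$@"; do
        run_cmd ifconfig "$ifn" down
    done
    for ifn in "$@"; do
        run_cmd ifconfig "$ifn" up
    done
}
```

With `DRYRUN=1` set, `bounce_ifaces em0 em1` only prints the four ifconfig commands; without it, the commands are executed.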

Matt, do you have anything that I can look at to see what might be wrong
with the MB? I never got anyone in FreeBSD to give a hoot about it. The
info I have is:

- At high speeds, the em transmit interface gets locked. Since this never
happens in device_polling mode, my assumption is that the interrupts
aren't working properly.
- There are 2 on-board NICs and 2 NICs in a PCI-X slot. When passing
data through the 2 PCI-X NICs, the lockup occurs within seconds. When
using the on-board NICs, it takes a long time, perhaps an hour, before
a problem occurs. The difference between the on-board NICs and
the PCI-X NICs is that the on-board NICs are running in 32-bit/33 MHz
mode while the PCI-X NICs are running in 64-bit/66 MHz mode.
- All NICs use the em driver.
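
One way to test the interrupt theory in the list above is to check whether the interrupt counter for an em device keeps advancing while the transmitter is locked up. The sketch below assumes `vmstat -i` prints one line per interrupt source ending in a total and a rate column, so the total is the second-to-last field; that layout is an assumption and may differ between releases. The helper names `irq_total` and `irq_advanced` are made up for illustration.

```shell
#!/bin/sh
# Print the cumulative interrupt count for one device from
# `vmstat -i`-style text on stdin. Assumes each line ends with
# "<total> <rate>", so the total is the second-to-last field.
irq_total() {
    awk -v dev="$1" 'index($0, dev) && NF >= 3 { print $(NF-1); exit }'
}

# Compare two totals taken some seconds apart.
# $1 = total before, $2 = total after
irq_advanced() {
    if [ "$2" -gt "$1" ]; then
        echo "advancing"
    else
        echo "stalled"
    fi
}
```

A possible use is to take `vmstat -i | irq_total em0` before and after a few seconds of traffic and pass both numbers to `irq_advanced`. A counter that stays stalled under load with the link up would be consistent with lost interrupts, though it does not prove it, and a shared interrupt line would muddy the reading.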

I have to try Linux on this machine. It's a Supermicro MB and they
claim to have no info on problems with the hardware.


