DragonFly BSD
DragonFly users List (threaded) for 2006-08

Re: Postfix suddenly stopped working


From: Jon Hamilton <hamilton@xxxxxxxxx>
Date: Sat, 12 Aug 2006 15:40:58 -0500

Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>, said on Thu Jul 27, 2006 [07:14:53 PM]:
} 
}     With postfix stuck, do:
} 
}     /usr/local/bin/vnodeinfo -a > /tmp/outfile
} 
}     Then look for vnode information structures containing LOCKS or BLKED
}     entries that might be related to the problem.
} 
}     If there are no blocked locks then it could be a race in our POSIX
}     locking sleep/wakeup code.

I ran with -HEAD built from a couple of weeks ago and did not see a
recurrence of the postfix queue "sticking".  Last night, I went back
to

DragonFly woodstock.nethamilton.net 1.7.0-PREVIEW DragonFly 1.7.0-PREVIEW #6: Sat Aug 12 12:07:04 CDT 2006     hamilton@xxxxxxxxxxxxxxxxxxxxxxxxx:/usr/obj/usr/src/sys/WOODSTOCK  i386

with a build/installkernel and installworld.

To my surprise, the symptom popped back up this morning.  I checked the source,
and found that the patch above hadn't been applied.  I applied the patch and 
rebuilt and installed the kernel, and the queue got stuck again this 
afternoon.

I ran vnodeinfo as above, and after ripping out the non-locked stuff from
the output the results are at http://www.nethamilton.net/lock_debug/stuck1.txt
(which is pre-patch) and http://www.nethamilton.net/lock_debug/stuck2.txt 
(post-patch).  I'm not sure what this is trying to tell me aside from 
confirming that postfix is holding a lock on unix.local.  
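
The filtering step described above (keeping only the records that mention
LOCKS or BLKED) could be sketched roughly as follows. This is only an
illustration, not the method actually used to produce stuck1.txt and
stuck2.txt. The layout of the vnodeinfo output is an assumption here: the
sketch supposes each record starts on an unindented line and that its detail
lines are indented beneath it. The function name locked_records is made up
for this example.

```python
import re

# Keywords named in the quoted advice: records containing these are of interest.
KEYWORDS = re.compile(r"\b(LOCKS|BLKED)\b")


def locked_records(lines):
    """Yield each record (a list of lines) that mentions LOCKS or BLKED.

    ASSUMPTION: a record begins on a non-blank line with no leading
    whitespace, and its continuation lines are indented. The real
    vnodeinfo layout may differ; adjust the record-start test if so.
    """
    record = []
    for line in lines:
        starts_record = bool(line.strip()) and not line[0].isspace()
        if starts_record and record:
            # A new record begins: emit the previous one if it matched.
            if any(KEYWORDS.search(entry) for entry in record):
                yield record
            record = []
        if line.strip():
            record.append(line.rstrip("\n"))
    # Emit the final record if it matched.
    if record and any(KEYWORDS.search(entry) for entry in record):
        yield record
```

With the output saved to /tmp/outfile as in the quoted instructions, one
could open that file, pass the file object to locked_records(), and print
each returned record to get a reduced listing.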

A couple of questions:
1) Is this a different problem, since it's occurring even after I applied 
   the patch?
2) What can I do to diagnose further?  

I'm happy to fiddle around to gather info on this, but need a little 
hand holding in terms of exactly what to do.  

-- 

   Jon Hamilton 
   hamilton@xxxxxxxxx


