DragonFly kernel List (threaded) for 2005-02
Re: rc and smf
Joerg Sonnenberger wrote:
On Thu, Feb 24, 2005 at 11:39:36AM -0800, Matthew Dillon wrote:
But anyhow, back to service failures... service failures do not always
end in a crash. Take BIND for example. It is far more likely that
BIND's cache will become corrupted than for BIND to actually crash. A
simple 'detect that it died and restart' monitor doesn't help you there.
What you have to do is have a program which actually goes in and uses
the service for real. e.g. for a web server a program which connects
to it every minute and retrieves the most complex CGI'd page it
serves out. That's the sort of monitoring we need... not this simple
it-dies-and-we-restart stuff. Service corruption is the far more likely
scenario these days.
I completely agree. IBM has a nice, extensible monitoring facility for AIX,
basically a combination of sensors and trigger rules. The concept alone
is pretty simple, but it does provide powerful tools.
I'd love to have such a daemon written in a modular way for DragonFly/BSD.
It would be something like SNMP with intelligence.
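
As a rough illustration of the "sensors plus trigger rules" idea, here is a minimal Python sketch. It is not the interface of the AIX facility mentioned above; the names `Monitor`, `Rule`, `add_sensor`, `add_rule` and `poll` are all hypothetical and chosen only for this example. A sensor is any callable that returns a number, and a rule pairs a sensor name with a condition and an action.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Hypothetical trigger rule: run `action` when `condition` holds."""
    sensor: str                             # name of the sensor to read
    condition: Callable[[float], bool]      # True means the rule fires
    action: Callable[[str, float], None]    # called with (sensor name, value)


class Monitor:
    """Hypothetical registry of sensors and the rules evaluated on them."""

    def __init__(self) -> None:
        self.sensors: Dict[str, Callable[[], float]] = {}
        self.rules: List[Rule] = []

    def add_sensor(self, name: str, read: Callable[[], float]) -> None:
        self.sensors[name] = read

    def add_rule(self, rule: Rule) -> None:
        if rule.sensor not in self.sensors:
            raise KeyError(f"unknown sensor: {rule.sensor}")
        self.rules.append(rule)

    def poll(self) -> List[str]:
        """Read each rule's sensor once and return the sensors whose rules fired."""
        fired = []
        for rule in self.rules:
            value = self.sensors[rule.sensor]()
            if rule.condition(value):
                rule.action(rule.sensor, value)
                fired.append(rule.sensor)
        return fired
```

A real daemon would call `poll()` on a timer and load sensors and rules as modules; the point here is only that the two concepts stay separate, so new sensors can be added without touching the rules.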
'checkservice' - in the ports tree for some years, lets us keep an eye
on our server's 'public facing' daemons from other servers (or locally,
but 'Quis Custodiet' etc.).
Not perfect, but extensible by 'plugins'. Can do realistic tests, not
just check that the port is active.
Maybe a start for something more general-purpose?
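
For illustration, a "realistic test" of the kind discussed in this thread might look roughly like the Python sketch below. It is not how checkservice or its plugins work; `fetch` and `check_service` are hypothetical names, and the URL and expected text are supplied by the caller. The check fetches a page over HTTP and looks for text that only a correctly working service would produce, so a daemon that is still running but serving garbage is reported as unhealthy.

```python
import http.client
import urllib.request


def fetch(url, timeout=10.0):
    """GET the URL and return (status code, body decoded as text)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


def check_service(url, expected_text, fetcher=fetch):
    """Use the service for real and return (healthy, reason).

    `fetcher` can be replaced, e.g. for testing without a network.
    """
    try:
        status, body = fetcher(url)
    except (OSError, http.client.HTTPException) as exc:
        # Connection refused, timeout, HTTP error status, broken response...
        return False, f"request failed: {exc}"
    if status != 200:
        return False, f"unexpected status {status}"
    if expected_text not in body:
        # The daemon answered, but not with what a healthy service would say.
        return False, "response did not contain expected text"
    return True, "ok"
```

Run from cron or a small loop once a minute against the most complex CGI page the server offers, a failing result could then trigger a restart or an alert instead of waiting for the process to die.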