On Tue, 30 Dec 2003, Fred Wright wrote:
> > - removing watchdogd
> Since the current watchdog support obviously needs work, that's probably
> not a bad idea, although it doesn't solve the underlying problem of the
> system getting too bogged down to run processes.
Folks, there is _NOTHING_ in that watchdog; it enables a 31-second
hardware countdown on the CPU by passing a single ioctl() to the kernel,
which directly writes the value into a CPU register. It then sleeps for 15
seconds (using nanosleep()) and does it again.
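To show how little there is to it, here is a minimal sketch of that loop. It is not the actual watchdogd source: pat_fn, counting_pat and watchdog_loop are names made up for illustration, and the real ioctl() (device and request code are platform-specific) is hidden behind the callback so the loop can be tried anywhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback that re-arms the hardware countdown. On a real
 * box it would wrap the single ioctl(); that call is deliberately not
 * shown because its device and request code are platform-specific. */
typedef int (*pat_fn)(void *ctx);

/* Stand-in for the real thing: just counts how often it was called. */
int counting_pat(void *ctx)
{
    int *count = (int *)ctx;
    (*count)++;
    return 0;
}

/* Re-arm the countdown, sleep, repeat. A negative 'rounds' means run
 * forever. Returns the number of pats done, or -1 if a pat failed. */
long watchdog_loop(pat_fn pat, void *ctx, struct timespec interval, long rounds)
{
    long done = 0;

    assert(pat != NULL);
    while (rounds < 0 || done < rounds) {
        if (pat(ctx) != 0)
            return -1;
        if (rounds >= 0)
            done++;

        /* nanosleep() can be cut short by a signal; resume with the
         * remaining time so the sleep is not shortened. */
        struct timespec left = interval;
        while (nanosleep(&left, &left) == -1 && errno == EINTR)
            ;
    }
    return done;
}
```

With a 15-second interval, rounds set to -1 and a callback that really issues the ioctl(), this is the whole daemon. If the loop is starved for longer than the hardware countdown, the box reboots.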
If userland does not get enough CPU for 30+ seconds to do even the
bare basics, more than just the watchdog breaks. Fixing this symptom (of
an overloaded system) by removing the watchdog is only going to unearth
another problem; e.g. syslog, DNS timing problems, DHCP running amok, huge
listen() queues, mbuf starvation, etc.
Fixing the problem could be
-> not throwing unrealistic loads at a 486 Soekris; again,
   those things are deployed by the many hundreds in
   commercial settings - and I've not seen this issue
   in real life when they handle multiple 100Mbits on
   several T1's. If you can afford more - perhaps an
   upgrade is in order :-)
-> make the kernel/hw go faster
-> let some traffic damping/packet dropping kick in
   if the machine gets overwhelmed.
but shooting the messenger is not going to fix the fundamental issue; it
just gets you by until the next symptom gets too painful. And that one
may be a whole lot harder to debug than an obvious reboot.
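As for the traffic damping option in the list above, one common way to do it is a token bucket that drops packets once a sustained rate is exceeded. The sketch below is only an illustration of that idea; the token bucket is my choice of technique, and struct token_bucket, tb_init and tb_admit are hypothetical names, not an existing kernel interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token bucket: one token per admitted packet. */
struct token_bucket {
    double tokens;    /* tokens currently available */
    double capacity;  /* maximum burst size, in packets */
    double rate;      /* refill rate, in packets per second */
    double last;      /* time of the last refill, in seconds */
};

/* Start with a full bucket at time 'now'. */
void tb_init(struct token_bucket *tb, double rate, double capacity, double now)
{
    assert(rate > 0.0 && capacity >= 1.0);
    tb->tokens = capacity;
    tb->capacity = capacity;
    tb->rate = rate;
    tb->last = now;
}

/* Returns true if the packet may pass, false if it should be dropped. */
bool tb_admit(struct token_bucket *tb, double now)
{
    double elapsed = now - tb->last;

    /* Refill in proportion to elapsed time, capped at the burst size. */
    if (elapsed > 0.0) {
        tb->tokens += elapsed * tb->rate;
        if (tb->tokens > tb->capacity)
            tb->tokens = tb->capacity;
        tb->last = now;
    }

    if (tb->tokens >= 1.0) {
        tb->tokens -= 1.0;
        return true;
    }
    return false;
}
```

The caller would run tb_admit() on each incoming packet and drop those for which it returns false, so that under overload the box sheds traffic instead of starving userland.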