On Tue, 30 Dec 2003, Fred Wright wrote:

> > - removing watchdogd
>
> Since the current watchdog support obviously needs work, that's probably
> not a bad idea, although it doesn't solve the underlying problem of the
> system getting too bogged down to run processes.

Folks, there is _NOTHING_ in that watchdog; it enables a 31 second
hardware countdown on the CPU by passing a single ioctl() to the kernel,
which directly writes the value into a CPU register. It then sleeps 15
seconds (using nanosleep()) and does it again.

If userland does not get enough CPU for 30+ seconds to do even the bare
basics, more than just the watchdog breaks.

Fixing this symptom (of an overloaded system) by removing the watchdog is
only going to unearth another problem; e.g. syslog, DNS timing problems,
DHCP running amok, huge listen() queues, mbuf starvation, etc.

Fixing the problem could be:

-> not throwing unrealistic loads at a 486 Soekris; again, those things
   are deployed by the many hundreds in commercial settings - and I've
   not seen this issue in real life when they handle multiple 100Mbits
   on several T1's. If you can afford more - perhaps an upgrade is in
   order :-)

-> making the kernel/hw go faster

-> letting some traffic damping/packet dropping kick in if the machine
   gets overwhelmed.

... but shooting the messenger is not going to fix the fundamental issue;
it just gets you by until the next symptom gets too painful. And that
one may be a whole lot harder to debug than an obvious reboot.

Dw
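
For readers who want to see how little is involved, the loop described in the message (arm a 31 second countdown, nanosleep() for 15 seconds, repeat) might look roughly like the sketch below. This is not the actual watchdogd source. The names `pat_fn`, `log_pat`, `sleep_interval` and `watchdog_loop` are made up for illustration, and the real ioctl() request is deliberately left out and replaced by a callback, since the device and request code depend on the platform.

```c
#define _POSIX_C_SOURCE 199309L  /* for nanosleep() under strict C modes */
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Values taken from the description in the message. */
#define WATCHDOG_TIMEOUT_SEC 31
#define PAT_INTERVAL_SEC     15

/* Hypothetical callback that re-arms the hardware countdown.
 * In a real daemon this would wrap the single ioctl() call.
 * Returns 0 on success, non-zero on failure. */
typedef int (*pat_fn)(unsigned int timeout_sec);

/* Stand-in "pat" used for illustration: it only records and logs. */
static int pat_count = 0;
static unsigned int last_timeout = 0;

static int log_pat(unsigned int timeout_sec)
{
    pat_count++;
    last_timeout = timeout_sec;
    printf("re-armed countdown: %u s (pat #%d)\n", timeout_sec, pat_count);
    return 0;
}

/* Sleep for the requested interval, resuming if a signal interrupts us. */
static int sleep_interval(time_t sec, long nsec)
{
    struct timespec req = { .tv_sec = sec, .tv_nsec = nsec };
    struct timespec rem;

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;  /* continue with the time that was left */
    }
    return 0;
}

/* Re-arm, sleep, repeat. A negative `rounds` means loop forever.
 * Returns the number of completed rounds, or -1 on the first failure. */
static int watchdog_loop(pat_fn pat, unsigned int timeout_sec,
                         time_t interval_sec, long interval_nsec, int rounds)
{
    int done = 0;

    while (rounds < 0 || done < rounds) {
        if (pat(timeout_sec) != 0)
            return -1;
        if (sleep_interval(interval_sec, interval_nsec) != 0)
            return -1;
        if (rounds >= 0)
            done++;
    }
    return done;
}
```

A daemon built on this sketch would call `watchdog_loop(real_pat, WATCHDOG_TIMEOUT_SEC, PAT_INTERVAL_SEC, 0, -1)`, where `real_pat` is a hypothetical function issuing the platform's watchdog ioctl(). Each pass through the loop does almost no work, which is the point of the message: if even this never gets scheduled in time, the machine is starved of CPU.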