On 24.12.2003, at 10:31, Justin Albstmeijer wrote:
> Killing the watchdogd, does not solve the reboot problem.
Uh-oh, Justin, unless there's another problem I haven't seen yet, you
owe me one now. ;) Killing watchdogd does indeed solve the problem - of
course I did not try this at first after you said it didn't help. Are
you really sure you actually killed the watchdogd process? It makes
perfect sense to me now: at 100.0% CPU load (mostly interrupts), that
poor watchdogd process starves and doesn't get to tickle the watchdog
again. After about 30 seconds, the watchdog timer fires, the CPU is
reset - bingo. Also, now I know why I didn't see this problem back when
I was doing some serious throughput testing with net45xx's and
m0n0wall: there was no watchdogd in m0n0wall at that time!
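
To illustrate the starvation scenario described above, here is a toy simulation in Python. It is only a sketch: the 30-second timeout is the figure mentioned above, while the function name, the 10-second tickle interval, and the one-second time step are assumptions made up for this example and do not reflect how watchdogd or the hardware watchdog are actually implemented.

```python
def simulate_watchdog(timeout_s, tickle_interval_s, starved_from_s, duration_s):
    """Return the second at which the watchdog resets the CPU, or None.

    Hypothetical model: time advances in 1-second steps, and the daemon
    tickles the watchdog every `tickle_interval_s` seconds as long as it
    still gets CPU time (i.e. before `starved_from_s`).
    """
    last_tickle = 0
    for now in range(1, duration_s + 1):
        # The daemon is only scheduled before the interrupt load starves it.
        daemon_runs = now < starved_from_s
        if daemon_runs and now % tickle_interval_s == 0:
            last_tickle = now
        # The watchdog fires once it has gone untickled for the full timeout.
        if now - last_tickle >= timeout_s:
            return now
    return None


if __name__ == "__main__":
    # Daemon never starved: no reset within the simulated window.
    print(simulate_watchdog(30, 10, 10**9, 300))
    # Daemon starved from t=60 onward: reset one timeout after the last tickle.
    print(simulate_watchdog(30, 10, 60, 300))
```

In this model the reset always comes one full timeout after the last successful tickle, which matches the observed behaviour: the box reboots roughly 30 seconds after the load starts.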
Since running watchdogd at an increased (or even realtime) priority
doesn't help here (I tried it, too), I'm wondering what net45xx
m0n0wall users think we should do... I see two solutions:
- removing watchdogd
- using polling by default on net45xx
I'm not sure if the latter is such a good idea (even though polling
yields a slightly higher throughput), because wireless cards can still
cause the same problem.
If no better solution is proposed, I'll just remove watchdogd from the
next release. I'm not entirely sure it's needed anyway... has anybody's
net45xx ever hung for no reason? Mine certainly haven't.
- Manuel