 From:  "Manuel Kasper" <mk at neon1 dot net>
 To:  <bzoller at cavokintl dot com>
 Cc:  <list at m0n0wall dot neon1 dot net>
 Subject:  Re: [m0n0wall] Script to watch DHCPD
 Date:  Wed, 26 Mar 2003 19:44:22 +0100 (CET)
Hi Bob,

> I've been using m0n0wall for a couple weeks now, and truthfully it rocks
> :)  The only issue I've had is the dhcp server dies occasionally.
> Sometimes twice within one day, sometimes only once during a whole
> week..  In any case, I've written a script to check on the dhcpd process
> every minute and reset it if it has died.

Strange, but actually it does not surprise me. On my "production"
m0n0wall, MPD (which I use for PPPoE on an ADSL link) dies every 4-5 days
or so with no messages whatsoever. I have no idea at all what
could cause this, but you (and all the other m0n0wall users) should be
aware of one unresolved mystery:

m0n0wall used to run with the standard FreeBSD userland ppp. It all worked
great in the shell script configuration (never released) and I had uptimes
of several weeks. Things changed all of a sudden when I moved to PHP
configuration - ppp would die soon (after 5 MB or about 30 minutes)
without any debug output at all. I checked everything, remade the whole
system (including make world and kernel etc.), and it boiled down to a
simple fact:

- ppp exec'd from /etc/rc shell script: no problems
- ppp exec'd from PHP: dies after about half an hour

I have investigated pretty much every possible reason for this behavior,
including various ways of exec'ing programs in the background from within
PHP, using a myriad of tools like 'daemon', exec'ing a shell script from
PHP which in turn exec'd ppp - it didn't matter; if PHP was involved, ppp
died soon. I also checked for any rlimits imposed on ppp - there were none.
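
To make "exec'ing in the background" and "checking rlimits" concrete, here
is a minimal sketch. It is written in Python rather than PHP and is not the
code m0n0wall used; the helper names (spawn_detached, is_session_leader,
current_rlimits) are made up for illustration. It starts a child in its own
session with stdio on /dev/null, roughly what a tool like 'daemon' arranges,
and lists the resource limits a child would inherit from the launcher.

```python
import os
import resource
import subprocess


def spawn_detached(argv):
    """Start argv in a new session with stdin/stdout/stderr on /dev/null."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # child calls setsid() before exec
        close_fds=True,          # do not leak the launcher's descriptors
    )


def is_session_leader(pid):
    """True if pid leads its own session, i.e. it was detached via setsid()."""
    return os.getsid(pid) == pid


def current_rlimits():
    """Return {name: (soft, hard)} for the RLIMIT_* values this platform has.

    Children inherit these from the launching process, so they are worth
    comparing between a shell-launched and an interpreter-launched daemon.
    """
    limits = {}
    for name in dir(resource):
        if not name.startswith("RLIMIT_"):
            continue
        try:
            limits[name] = resource.getrlimit(getattr(resource, name))
        except (ValueError, OSError):
            # Limit is defined in the module but not usable on this system.
            continue
    return limits
```

Comparing the output of current_rlimits() from both launch paths is one way
to rule limits in or out; it says nothing about other inherited state such
as signal masks or open descriptors.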

I have also posted to freebsd-questions about this, but nobody seemed to
know an answer. I have tried on different net45xx's - always the same result.

At some point before pb1 was released, I noticed that MPD would be a much
better alternative (much faster due to kernel data processing, etc.), so I
switched over to MPD and didn't see the problem again. But maybe this has
just stretched the mysterious time frame from 30 minutes to 5 days? I
don't know, as all other services (including DHCP) run indefinitely
without problems on my m0n0wall.

Any insight on this matter would be greatly appreciated. If I can't find
the reason soon, I'll just set up a standard PC with m0n0wall software and
see if the problem is related to the Soekris hardware (which I hope, and
believe, it is not). And if that doesn't yield any interesting facts either,
we'll just have to use respawn scripts like the one you proposed (thank
you!) - I'd prefer not to have to resort to that kind of solution, though.
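
For readers who have not seen one, a respawn check of the kind discussed
here can be sketched as follows. This is not Bob's script, which is not
reproduced in this message, and it is in Python purely for illustration.
The pid file path and restart command in the usage note are assumptions,
as are the function names. The idea: read the daemon's pid file, probe the
process with signal 0, and run a restart command if it is gone.

```python
import os
import subprocess
import time


def read_pid(pidfile):
    """Return the pid stored in pidfile, or None if missing or malformed."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return None
    return pid if pid > 0 else None


def pid_alive(pid):
    """Probe a process with signal 0; nothing is actually delivered."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def ensure_running(pidfile, restart_argv):
    """Run restart_argv if the daemon is not alive; return True if it ran.

    Assumes restart_argv returns promptly, e.g. because the daemon puts
    itself in the background.
    """
    pid = read_pid(pidfile)
    if pid is not None and pid_alive(pid):
        return False
    subprocess.run(restart_argv, check=False)
    return True


def watch(pidfile, restart_argv, interval=60, rounds=None):
    """Check every `interval` seconds; rounds=None means loop forever."""
    restarts = 0
    n = 0
    while rounds is None or n < rounds:
        if ensure_running(pidfile, restart_argv):
            restarts += 1
        n += 1
        if rounds is None or n < rounds:
            time.sleep(interval)
    return restarts
```

A call might look like watch("/var/run/dhcpd.pid", ["/usr/local/sbin/dhcpd"]),
where both the path and the command are placeholders to adapt. Note that a
pid file check can be fooled if the pid has been reused by another process.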