Manuel Kasper wrote:
> On 16.05.2008, at 20:01, mtnbkr wrote:
>> 3648 active
> I can't really see anything out of the ordinary in your vmstat/ipfstat
> output; however, given the strict ruleset that you described, I find
> 3648 active connections quite a lot. You could try "ipfstat -ls" to see
> the full list of active connections - maybe you'll see something odd
> there. Of course if you have lots of users it could also be quite
> normal. ;)
Yeah, that does seem a bit high, especially since there are fewer than
300-400 users on campus. I'll take a look and see.
You know, I was just about to post a follow-up to my last post... The
thought that came to mind was that things have been pretty stable over
there, and that I may have solved the problem a while back without
getting positive feedback from the resolution(s). The issue just sort of
quietly faded away and I kept thinking that I was in a holding pattern,
just waiting for it to happen again...
What I THINK happened was the following:
The email server on the DMZ is configured with djb's dnscache (part of
his djbdns package). Running dnscache on the email server with a
rather large lookup cache cuts down on the number of dns queries
generated by RBL lookups etc.
What I THINK may have caused the state table overflows, kernel panics
and reboots in the past was a combination of the following factors:
- The djbdns cache on the email server was FAR too small
- The dnscache program on the email server was configured and allowed to
send its dns requests to the internal dns server
- The internal dns server's lookup cache was ALSO configured to be FAR
too small
So, what I seem to recall is that with a lot of email coming in, there
was a FLURRY of DMZ-to-Internal dns requests, each followed by an
internal-to-Internet dns request to fulfill the email server's
request... With each dns lookup traversing the m0n0wall twice.
That was an oversight (and a dumb move) on my part when I was in a rush,
so I docked myself a day's pay. lol
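To put rough numbers on the double traversal described above, here is a small back-of-the-envelope sketch in Python. Every figure in it (lookup rate, state timeout) is a made-up assumption for illustration, not something measured on this network, and the function name is hypothetical. It also assumes each lookup creates one new state entry per traversal, which is a simplification, since a recursive lookup to the Internet can involve several queries.

```python
# Rough estimate of firewall state entries created by dns lookups.
# All numbers are hypothetical; none were measured on the real network.

def dns_states(lookups_per_sec, traversals_per_lookup, state_timeout_sec):
    """Approximate steady-state number of state entries:
    arrival rate of new states multiplied by how long each one lingers."""
    return lookups_per_sec * traversals_per_lookup * state_timeout_sec

# Assumed: 20 uncached lookups per second, each state kept for 60 seconds.
via_internal = dns_states(20, 2, 60)  # DMZ -> internal, then internal -> Internet
direct = dns_states(20, 1, 60)        # email server resolves via the Internet directly

print(via_internal, direct)  # prints "2400 1200"
```

Under these assumptions, routing the email server's lookups through the internal dns server doubles the number of states the firewall has to hold, and an undersized cache on either side pushes the lookup rate (and so the state count) up further.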
The box has been pretty stable lately and I am pretty sure that the only
reboots recently have been due to extended power outages.
> Well, I have to admit that I don't know what to suggest at this point.
> If the problem is really related to some odd traffic that occurs only
> once in a while and somehow messes up ipfilter, one way of hopefully
> getting a bit closer to finding out what it is would be to capture all
> traffic on the LAN interface of your m0n0wall. Wireshark has a ring
> buffer feature that allows it to capture indefinitely while consuming a
> fixed amount of disk space. Then when the problem occurs again, you
> could correlate the time of the kernel panic with the traffic just
> before it happened, and hopefully discover something extraordinary. It
> could be a lot of data to sift through, though...
Yeah, that is a good idea. Wireshark is a great tool.
BTW, I think the output of the firewall states page of m0n0wall was
actually key in steering me towards a dns issue being the cause.
Sorry for bothering you today with this (old), apparently solved issue.
This post should probably have been posted as a response/follow-up to my
March posts so that that thread can be followed to a conclusion if
others experience a similar situation.
Paypal donation should reach you before this email does. :)
Reverse Polarity, LLC