Manuel Kasper wrote:
> On 16.05.2008, at 20:01, mtnbkr wrote:
>
>> 3648 active
>
> I can't really see anything out of the ordinary in your vmstat/ipfstat
> output; however, given the strict ruleset that you described, I find
> 3648 active connections quite a lot. You could try "ipfstat -ls" to see
> the full list of active connections - maybe you'll see something odd
> there. Of course if you have lots of users it could also be quite
> normal. ;)

Yeah, that does seem a bit high, especially since there are fewer than
300-400 users on campus. I'll take a look and see.

You know, I was just about to post a follow-up to my last post...

The thought that came to mind was that things have been pretty stable
over there, and that I may have solved the problem a while back without
getting positive feedback from the resolution(s). The issue just sort of
quietly faded away and I kept thinking that I was in a holding pattern,
just waiting for it to happen again...

What I THINK happened was the following:

The email server on the DMZ is configured with djb's dnscache (part of
his djbdns package). Putting djb's dnscache on the email server with a
rather large lookup cache helps lessen the number of dns lookups for the
RBL lookups etc.

What I THINK may have caused the state table overflows, kernel panics
and reboots in the past was a combination of the following factors:

- The djbdns cache on the email server was FAR too small
- The dnscache program on the email server was configured and allowed to
  make dns requests to the internal dns server
- The internal dns server's lookup cache was ALSO configured to be FAR
  too small

So, what I seem to recall is that with a lot of email coming in, there
was a FLURRY of DMZ-to-Internal dns requests, each followed by an
internal-to-Internet dns request to fulfill the email server's
request... With each dns lookup traversing the m0n0wall twice.

That was an oversight (and a dumb move) on my part when I was in a rush,
so I docked myself a day's pay. lol

The box has been pretty stable lately and I am pretty sure that the only
reboots recently have been due to extended power outages.

> Well, I have to admit that I don't know what to suggest at this point.
> If the problem is really related to some odd traffic that occurs only
> once in a while and somehow messes up ipfilter, one way of hopefully
> getting a bit closer to finding out what it is would be to capture all
> traffic on the LAN interface of your m0n0wall. Wireshark has a ring
> buffer feature that allows it to capture indefinitely while consuming a
> fixed amount of disk space. Then when the problem occurs again, you
> could correlate the time of the kernel panic with the traffic just
> before it happened, and hopefully discover something extraordinary. It
> could be a lot of data to sift through, though...

Yeah, that is a good idea. Wireshark is a great tool.

BTW, I think the output of the firewall states page of m0n0wall was
actually key in steering me towards a dns issue being the cause.

Sorry for bothering you today with this (old), apparently solved issue.
This post should probably have been posted as a response/follow-up to my
March posts so that that thread can be followed to a conclusion if
others experience a similar situation.

Paypal donation should reach you before this email does. :)

--
Bill Arlofski
Reverse Polarity, LLC