> This issue has been brought up before. Monowall/freebsd seems to have a
> bug that may cause a sporadic total OS freeze/lockup. I'm bringing this up
> again because I have an installation running monowall version 1.22, which
> freezes solid every once in a while.
>
> Initially I used a SOEKRIS net 4501 hardware platform, but the system
> froze approx. every 2-3 hours. I replaced the hardware with a SOEKRIS net 4801
> (replaced the PSU too) and have just seen a lockup after 7 days of uptime.
> There's nothing in the setup out of the ordinary.

First, I'd like to say I'm damn happy that I'm not the only one still seeing
this. And my write-up here is long. Sorry.

I have done some testing over the last few months with regard to this
problem. I currently have 4 m0n0wall boxes in service on various parts of
one very large LAN. The one that causes me problems is the one that has the
highest traffic load. The only special thing about this unit is it has 12
VLANs configured on OPT1. It serves approx. 200 clients to the Internet
(including our WISP traffic). DHCP on one VLAN serves 30 addresses, and
around 30 more are static. Another VLAN used for network management traffic has
approx. 100 nodes behind it, but these are mostly local traffic that does not
pass through the m0n0wall. Through-traffic rarely exceeds 4 Mbps one way.

I have tried a number of different hardware platforms in this position. My
original hardware was a Lex Systems CV860A, 800 MHz with Realtek NICs. At
first I suspected the NICs. Under the guidance of several mailing list
members I purchased the same hardware with a 1 GHz processor and Intel NICs.
Same issue. I then purchased higher quality CF cards (128 MB) but the
problem still occurred. After that I went to an old generic tower PC - P3
with 256 MB of RAM, using an old hard drive. Of course it happened again.
My next try was the CD and floppy configuration. You guessed it, same
thing. Also, this whole time my other units, all CV860As with Realtek
NICs, have been running great with 90+ days of uptime.

This particular unit would lock up every 6-12 days. No indication of why,
no messages on the console, nothing in the syslogs. Every lock-up would
present a completely unresponsive box - could not ping any interface,
console would not respond, and even after unplugging the Ethernet cable the
lights for the port would stay on. Weird. A hard reboot was the only option.
Then it occurred to me that I remembered it lasting longer when a config
change was made. So I started making a config change every couple of days -
small changes like increasing or decreasing the number of log entries shown.
This got me up to 28 days. Wow! I haven't seen that in a long time.

I also seem to remember this fun little PITA showing up after I upgraded to
1.2. I also remember reading someone's post about this issue. He said he
encountered the same thing when upgrading to 1.2. He ended up rebuilding
his configuration from scratch on a box running 1.2 already. So I decided
to try something else. I burned a CD with 1.11 on it and rebooted my
machine from it using my existing config file. I have now been running
for 11 days with no config changes at all. I know, 11 days, wahoo. It's not
much in the real world, but it is for me, right now. I am in the process of
building my config from scratch with 1.22 on one of the CV860As (1 GHz, 512
MB RAM, 128 MB CF card, Intel NICs). I figure if I get up to 20+ days uptime
with the 1.11 CD/floppy, I will swap the new 1.22 box in and see where it
gets to.

This has been my experience so far. Thanks for reading. And I still love
m0n0wall...

Aaron