 From:  "David Burgess" <apt dot get at gmail dot com>
 To:  "Monowall Support List" <m0n0wall at lists dot m0n0 dot ch>
 Subject:  WAN fails to connect
 Date:  Wed, 8 Aug 2007 22:13:07 -0600
Hi,

I have three monowalls running nearly identical hardware. One of them
has been acting up lately and I'm having trouble pinning down the
nature of the problem.

The hardware (all 8 months old or less):

Sempron 1.6 GHz
512 MB RAM
WAN - nvidia onboard GbE (nve driver)
LAN - Intel PRO/1000 GT PCI (em)
OPT1 - Intel PRO/1000 PT PCI-e x1 (em) (new, added last week)

I was using mono 1.3b2 from January until yesterday (7 August). On
Thursday (2 Aug) we added the third nic and bridged it to the WAN.
OPT1, although active, hasn't actually seen use in the field yet.

We have about 250 customers on the LAN, and starting Thursday 2 Aug a
handful began complaining of intermittent connectivity. Suspecting a
problem with one of the nics, we swapped em0 and em1 between LAN and
OPT1, but the complaints continued.

Yesterday (7 Aug) we replaced the router with an identical spare
(except that OPT1 is a D-Link PCI card rather than an Intel PCI-e) and
we've had no complaints since.

I connected the suspect router to my home network. memtest86 ran for 8
hours without a single error. I pxe booted into ubuntu (ltsp) using
the nve card and then the pci-e card and ran steady traffic for a few
hours on each without a hiccup.  I then booted back into monowall and
ran several rounds of iperf through it (WAN-LAN) at near-gigabit
speeds.

Then suddenly, after several rounds of iperf and near the end of a
30-second test, the WAN stopped responding. No packets would go in or
out of it. I assigned each of the three interfaces as WAN in turn,
rebooting once or twice each time, but the WAN was unresponsive in
every case, so the problem doesn't appear to be tied to any specific
hardware interface.
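
To spell out the elimination reasoning above, here is a minimal sketch in
Python. The interface name nve0 is assumed from the driver name, and the
function is purely illustrative, not anything m0n0wall provides.

```python
def localize_fault(wan_worked_by_nic):
    """Given {nic_name: True if WAN worked on that NIC}, say where the fault sits.

    Hypothetical helper that only encodes the elimination logic described
    in the message; it does not probe any hardware.
    """
    failed = sorted(nic for nic, ok in wan_worked_by_nic.items() if not ok)
    if not failed:
        return "no fault observed"
    if len(failed) == len(wan_worked_by_nic):
        # Every NIC fails when it carries the WAN role, so a single bad
        # card or port is an unlikely explanation.
        return "follows the WAN role"
    return "tied to " + ", ".join(failed)


# What was observed: WAN was dead on all three interfaces (nve0 is an assumed name).
observed = {"nve0": False, "em0": False, "em1": False}
print(localize_fault(observed))
```

If only one interface had failed as WAN, the same check would point at that
card instead.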

The Status: Interfaces page shows the WAN and dhcp status both as "up"
with a dhcp release button, which effects no change when pressed. The
IP address is displayed as 0.0.0.0/8. The media is correctly displayed
as 1000baseTX full duplex. In/out packets show 0/2. ifconfig output is
in agreement with the status page. The upstream dhcp server shows no
evidence of a bootp request from mono.
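
As a sanity check on what the status page is reporting, here is a small
Python sketch that classifies an interface address. It assumes (my
assumption, not something confirmed from the m0n0wall source) that an
address of 0.0.0.0 simply means no lease was ever obtained; the function
name and the sample addresses other than 0.0.0.0/8 are hypothetical.

```python
import ipaddress


def describe_wan_address(cidr):
    """Classify an address/prefix string as shown on a status page."""
    iface = ipaddress.ip_interface(cidr)
    if iface.ip.is_unspecified:
        # 0.0.0.0 is the unspecified address: nothing has been assigned.
        return "unconfigured"
    if iface.ip.is_link_local:
        # 169.254.0.0/16: self-assigned, no DHCP server answered.
        return "link-local"
    return "configured"


print(describe_wan_address("0.0.0.0/8"))  # the value shown on the status page
```

By this reading, the "up" status describes only the link and the dhcp client
process, not a successful lease.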

When I assign the WAN IP address statically, the Status: Interfaces
page appears 'healthy', but there is still no response from the WAN.

I thought perhaps I had some corruption on my CF card, but having no
handy access to a card reader to reflash it, I instead did a firmware
update and a "factory reset" via the GUI, but the symptoms remain
identical.

So what would the wise readers of the mailing list suggest? Does this
look like a CF issue that a total reflash would/wouldn't correct? Does
it look more like a hardware issue? How could I narrow this down?

All hypotheses gratefully considered.

db