Hi,

I have three monowalls running nearly identical hardware. One of them has been acting up lately and I'm having trouble pinning down the nature of the problem.

The hardware (all 8 months old or less):

  Sempron 1.6 GHz
  512 MB RAM
  WAN  - nvidia onboard GbE (nve driver)
  LAN  - Intel Pro 1000 GT PCI (em)
  OPT1 - Intel Pro 1000 PT PCI-e 1x (em) (newly added last week)

I was using mono 1.3b2 from January until yesterday (7 August). On Thursday (2 Aug) we added the third NIC and bridged it to the WAN. OPT1, although active, hasn't actually seen use in the field yet.

We have about 250 customers on the LAN, and as of Thursday 2 Aug a handful started complaining of intermittent connectivity. Thinking there was an issue with one of the NICs, we swapped em0 and em1 for LAN and OPT1, but the complaints continued. Yesterday (7 Aug) we replaced the router with an identical spare (except that OPT1 is a D-Link PCI rather than an Intel PCI-e) and we've had no complaints since.

I connected the suspect router to my home network. memtest86 ran for 8 hours without a single error. I PXE booted into Ubuntu (LTSP) using the nve card and then the PCI-e card, and ran steady traffic for a few hours on each without a hiccup.

I then booted back into monowall and ran several rounds of iperf through it (WAN-LAN) at near-gigabit speeds. After several rounds, near the end of a 30-second test, the WAN suddenly stopped responding. No packets would go in or out of it. I assigned each of the three interfaces as WAN in turn, with a reboot or two each time, but every time the WAN was unresponsive, so the problem doesn't appear to be tied to any specific hardware interface.

The Status: Interfaces page shows the WAN and DHCP status both as "up", with a DHCP release button that effects no change when pressed. The IP address is displayed as 0.0.0.0/8. The media is correctly displayed as 1000baseTX full duplex. In/out packets show 0/2. ifconfig output agrees with the status page. The upstream DHCP server shows no evidence of a bootp request from mono. When I assign the WAN IP address statically, the Status: Interfaces page appears 'healthy', but there is still no response from the WAN.

I thought perhaps I had some corruption on my CF card, but having no handy access to a card reader to reflash it, I instead did a firmware update and a "factory reset" via the GUI. The symptoms remain identical.

So what would the wise readers of the mailing list suggest? Does this look like a CF issue that a total reflash would or wouldn't correct? Does it look more like a hardware issue? How could I narrow this down?

All hypotheses gratefully considered.

db