 From:  Jim Thompson <jim at netgate dot com>
 To:  Chris Buechler <cbuechler at gmail dot com>
 Cc:  m0n0wall at lists dot m0n0 dot ch
 Subject:  Re: [m0n0wall] 1.2b7 lockups on Soekris 48xx
 Date:  Mon, 28 Mar 2005 11:57:30 -1000
Chris Buechler wrote:

>On Mon, 28 Mar 2005 16:41:58 +0200, Frederick Page
><fpage at thebetteros dot oche dot de> wrote:
>
>>>>Might be indeed a temperature issue, the Bios freezes the machine
>>>>according to docs, when temperature gets too high. Maybe the sensor
>>>>is off in some machines.
>>>>        
>>>>
>>>One of the early things I plan to make run is the watchdog.  (At least,
>>>I haven't seen any evidence of it being enabled.)
>>>
>>That sounds like a good idea. However, if those freezes should be
>>because the BIOS of the WRAP (mistakenly) believes the temperature is
>>too high and freezes the machine, I doubt the watchdog will be able to
>>react (when complete machine is frozen by BIOS).
>>
>>I now have the WRAP running like hell, with no problems whatsoever,
>>the temperature theory seems to be it. Maybe a sensor is off and gives
>>wrong readings? I am very glad that my WRAP does not freeze anymore.
>>
>
>Given the problem discovered over the weekend with much higher CPU
>utilization on b5+, which of course would generate more heat from the
>CPU, this makes sense.  I don't think it should be getting hot enough
>to freeze up the WRAP though, unless it has a design flaw that it
>can't run at higher CPU utilization for very long.
>
>As far as the watchdog, it used to be enabled and was taken out long
>ago because it caused problems more than it helped.  It unnecessarily
>caused reboots, possibly amongst other problems, but check the
>archives if you want to look at it further.
>http://www.google.com/search?q=watchdog+site%3Am0n0%2Ech
>  
>

The net45xx boards (any Elan 520-based gear) have a well-understood
problem where enabling the watchdog can (badly) glitch the internal bus
every time the watchdog gets reset.  Typically this shows up during CF
writes (since the CF is on the same bus), but it could happen for any
write that passes over the "GP" bus.  There are rumors that this is
fixed in late versions of the Elan520, but, of course, you have no
control over which Elan you get from Soren.

There are signs of a work-around in FreeBSD that turns off bus echo mode
while resetting the watchdog.  I don't know how well it works.  See
elan_watchdog() in sys/i386/i386/elan-mmcr.c for details.  It is
worth a try.
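
The shape of that work-around (save the echo setting, switch echo off,
reset the watchdog, put the echo setting back) can be sketched roughly as
below.  This is only an illustration against a simulated register block,
not real hardware access and not the FreeBSD code: the struct, its field
names and both helper functions are hypothetical, and the two key values
are my assumption about the SC520 watchdog reset sequence.  Check
elan_watchdog() and the data sheet before relying on any of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed key values for the watchdog reset sequence (verify against
 * the data sheet; they are not taken from this thread). */
#define WDT_KEY1 0xaaaa
#define WDT_KEY2 0x5555

/* Hypothetical stand-in for the memory-mapped registers.  It records
 * what was written instead of touching hardware. */
struct fake_regs {
    uint8_t  gp_echo;            /* GP bus echo mode setting */
    uint16_t wdt_log[4];         /* values written to the watchdog control */
    size_t   wdt_writes;         /* number of entries used in wdt_log */
    uint8_t  echo_during_write;  /* set if echo was on during any write */
};

/* Simulated write to the watchdog control register. */
static void wdt_write(struct fake_regs *r, uint16_t v)
{
    assert(r->wdt_writes < 4);
    if (r->gp_echo != 0)
        r->echo_during_write = 1;
    r->wdt_log[r->wdt_writes++] = v;
}

/* Reset ("pet") the watchdog with echo mode temporarily switched off. */
static void watchdog_pet(struct fake_regs *r)
{
    uint8_t saved = r->gp_echo;  /* remember the current echo setting */

    r->gp_echo = 0;              /* echo off while we touch the watchdog */
    wdt_write(r, WDT_KEY1);
    wdt_write(r, WDT_KEY2);
    r->gp_echo = saved;          /* restore whatever was there before */
}
```

On real hardware the two assignments to gp_echo would be volatile
accesses to the mapped register, and the whole sequence would need to be
protected against anything else touching the GP bus settings in between.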

This doesn't mean it makes sense to disable the watchdog on all 
hardware, either.  The watchdog seems to just work (at least under 
l***x, I've not enabled it under FreeBSD yet) on the Geode SC1100 (WRAP 
and Soekris 48xx).

jim