 From:  Fred Wright <fw at well dot com>
 To:  m0n0wall dash dev at lists dot m0n0 dot ch
 Subject:  Re: [m0n0wall-dev] Increasing ipfilter NAT/state table sizes by default?
 Date:  Sat, 2 Oct 2004 19:14:50 -0700 (PDT)

On Fri, 1 Oct 2004, Manuel Kasper wrote:
> On 01.10.2004 00:30 -0700, Fred Wright wrote:
> > Note that with the current parameters, the 30000 NAT entries
> > allowed would tie up about 7.5MB, which is a bit much on a 32M
> > Soekris.  There's some code to reduce the limit if it has trouble
> > with allocations, but I suspect the system would be pretty sick by
> > the time it actually reached that code. So this limit may actually
> > be too *large* for the Soekris (and perhaps even WRAP) builds.  The
> > 4013 filter entries would use about 1MB.
> What about 64 MB boxes? I mean, increasing IPSTATE_SIZE/IPSTATE_MAX
> as well as NAT_TABLE_SIZE/NAT_TABLE_MAX alone to handle around ~50000
> entries would only cost a few hundred KB extra. But since it's then
> possible to go beyond ~4000 states, the 32 MB box could run out of
> memory during very heavy use and crash/panic/whatever. If we can be
> sure that it's no problem on a 64 MB box, I'd say let's go ahead. I
> don't really care about the 32 MB boxes, and as long as they work
> more or less (without webGUI firmware updates, and maybe with
> problems when tens of thousands of connections are established), it's
> OK with me. I don't want a few net4511s to limit the "recommended
> setup" (which is 64+ MB). Also, a "one size fits all" value would
> save another tuneable parameter in the config that many people
> wouldn't understand. People with extreme high volume setups can still
> compile their own kernel if they decide to use m0n0wall.

Except that worrying about the 30K NAT limit seems misdirected when the
4013-entry state limit makes it more or less impossible to exceed 4K
connections in the current config, and only a small subset of users seem
to have trouble even with that.  Setting both the NAT and state limits to,
e.g., 16K would quadruple the current connection capacity while limiting
the table memory consumption to about 8MB (at roughly 512 bytes per
connection).
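
As a rough check of the arithmetic, here is a minimal sketch in Python.  It
assumes the ~512 bytes per connection estimated in this thread (one NAT
entry plus one state entry), and it takes "16K" to mean 16 * 1024.  The
function name is made up for illustration; it is not part of IPFilter or
m0n0wall, and it counts table entries only.

```python
# Assumption: ~512 bytes per connection (one NAT entry plus one state
# entry), the rough figure used in this thread.  Table entries only.
BYTES_PER_CONNECTION = 512


def table_memory_mib(max_connections, bytes_per_connection=BYTES_PER_CONNECTION):
    """Worst-case NAT + state table memory, in MiB, with the tables full."""
    return max_connections * bytes_per_connection / (1024 * 1024)


if __name__ == "__main__":
    for label, limit in (("16K", 16 * 1024), ("50000", 50000)):
        print(f"{label} connections: {table_memory_mib(limit):.1f} MiB")
```

The figure is a floor rather than a prediction, since mbufs and other
per-connection costs are not included.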

Perhaps the NAT hash size should be increased to the same 5737 buckets
used by the filter; I'm not convinced that going bigger than that is
justifiable, since systems with enough connections to have a lot of
collisions at that size probably need to have more horsepower, anyway.
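
To put a number on "a lot of collisions", here is a small Python sketch of
the average chain length (load factor) for a 5737-bucket table.  It assumes
entries hash uniformly across buckets, which real traffic may not do, and
the function name is hypothetical.

```python
# Assumption: entries are spread uniformly over the hash buckets.
FILTER_HASH_BUCKETS = 5737  # hash size quoted above for the filter


def load_factor(entries, buckets=FILTER_HASH_BUCKETS):
    """Average number of entries per bucket when the table holds `entries`."""
    return entries / buckets


if __name__ == "__main__":
    for entries in (4013, 16 * 1024, 50000):
        print(f"{entries} entries: {load_factor(entries):.2f} per bucket")
```

Under that assumption, the current 4013-entry limit keeps the average below
one entry per bucket, while a 16K limit would put it a little under three.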

Unfortunately IPFilter doesn't track high-water marks for the tables, so
there's no easy way to get an idea of the worst-case usage in actual
practice.  It *does* track the actual failures, so people who suspect that
kind of trouble could check (unless it's killed the WebGUI).  The filter
code tracks limit-exceeded failures and allocation failures separately
("maximum" and "no memory" from "ipfstat -s", respectively), while the NAT
code lumps both cases together (as "no memory" from "ipnat -s").  If the
system isn't *really* out of memory, though, it should be safe to assume
that the limit is the cause of a NAT "no memory" count.
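
The interpretation of those counters can be written down as a short Python
sketch.  It takes the counter values as plain integers that someone has
already read off "ipfstat -s" and "ipnat -s"; it does not parse their
output, and the function names and the "system_low_on_memory" flag are
hypothetical.

```python
def filter_failure_causes(maximum, no_memory):
    """Filter side: limit hits and allocation failures are counted separately."""
    causes = []
    if maximum > 0:
        causes.append("state limit reached")
    if no_memory > 0:
        causes.append("allocation failure")
    return causes or ["no failures recorded"]


def nat_failure_cause(no_memory, system_low_on_memory):
    """NAT side: one counter covers both cases, so outside evidence is needed."""
    if no_memory == 0:
        return "no failures recorded"
    if system_low_on_memory:
        return "ambiguous: NAT limit or allocation failure"
    return "NAT limit reached (assumed, since memory is not short)"


if __name__ == "__main__":
    print(filter_failure_causes(maximum=12, no_memory=0))
    print(nat_failure_cause(no_memory=7, system_low_on_memory=False))
```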

Speaking of killing the WebGUI, it occurs to me that it might be desirable
for the automatic rule that guarantees LAN access to the WebGUI *not* to
use stateful filtering, for this very reason (it already avoids NAT).  
That would require moving it ahead of the rule pair that blocks
non-initial-SYN TCP packets (or duplicating the latter in each interface
group).  It would have the added benefit of allowing one to look at the
filter status without altering it (other than the hit counts).

> Memory usage on my net4501 is usually on the order of 33%, so that
> should leave 25 megs for 50000 NAT and state table entries in the
> extreme case (while it's doubtful anyway if a net4501 would be fast
> enough to handle that kind of load). I guess I'll just ship the next
> beta with a NAT/state max. of ~50000, and then people can test.

However, a system that busy would probably need more mbufs, and there
might be some other less obvious memory costs as well.  I expect that
extrapolating memory usage from a lightly-loaded configuration, based
solely on the 512 bytes per connection, would be an oversimplification.
And if the system really did start to run out of memory, it could have
symptoms that are much more erratic and difficult to diagnose than
filter/NAT failures.

Also, in the present architecture, every new feature eats RAM, whether
it's used or not, so the amount of RAM available today may not be the
amount available tomorrow.

					Fred Wright