 From:  Chris Buechler <cbuechler at gmail dot com>
 Cc:  m0n0wall at lists dot m0n0 dot ch
 Subject:  Re: [m0n0wall] Speed/Overhead Question
 Date:  Thu, 14 Jul 2005 11:11:04 -0400
On 7/14/05, A dot L dot M dot Buxey at lboro dot ac dot uk <A dot L dot M dot Buxey at lboro dot ac dot uk> wrote:
> Hi,
> > http://chrisbuechler.com/4vs5/
> not at all - you may find that 5.x supports the VIA so much better and
> has better PCI code etc that things affect the result. the ONLY way that this
> could be tested is to run a FreeBSD 4.11 system on the VIA board to make
> a valid comparison. otherwise its apples v's oranges.

No.  I've actually tested the same model box Holger mentioned on 4.x,
5.x, and 6.x.  There is no difference between any of them (it's not
CPU-bound at the high end, like slower hardware gets with 5.x and

I personally spoke to Robert Watson of the FreeBSD core team about 5.x
and 6.x network performance, both via email and in person at BSDCan.
In short, they know network performance is down significantly,
especially in firewall scenarios.  In most situations it isn't a big
deal, but firewalling combined with relatively slow hardware makes for
a huge hit, hence our move back to 4.x for m0n0wall for the 1.2

A few snips of an email he sent me regarding this:

As you are probably aware, one of the largest single architectural changes
between the FreeBSD 4.x and 5.x branches was the adoption of the SMPng
architecture.  This architecture introduces finer-grained locking and
ithreads, and permits much more parallelism in the kernel as compared to
FreeBSD 4.x, due to the removal of the Giant lock.  However, this
architectural change comes at a price: adding more locks adds overhead.
For some work loads, SMPng in 5.x is a clear win -- the increased
parallelism and preemptability of the kernel (even on UP) leads to a
substantial improvement in performance.  However, with other workloads,
the cost of additional atomic operations for locks is a significant

The network stack is probably the most sensitive part of the kernel when
it comes to observing the overhead of individual instructions -- functions
in the network stack are often run 2 or more times per packet received and
processed, especially in firewall scenarios.  Events also happen on the
order of millions of times per second, and in most gigabit network
environments, the system is CPU-bound.  Likewise, the move to ithreads has
added overhead in interrupt handling that can be quite sizable in terms of
measurable latency.

Part of the challenge here is that we've set ourselves a tremendously high
bar: the 4.x network stack is probably the fastest general purpose network
stack in the world, able to route packets, etc, an order of magnitude
faster than many competing systems.  Our intent is to recover almost all
UP performance for the network stack, and to substantially out-perform 4.x
on SMP.  We're still working on that, though :-).  
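
To put the per-packet overhead described in that email in rough numbers,
here is a back-of-envelope sketch.  It only does arithmetic: the frame
rate follows from standard Ethernet framing, but the CPU clock, the
number of atomic operations per packet, and the cycle cost of each one
are hypothetical inputs chosen for illustration, not measurements of
any FreeBSD release.

```python
# Back-of-envelope per-packet CPU budget.
# The inputs used in the example at the bottom are assumptions,
# not measured values for FreeBSD 4.x, 5.x, or 6.x.

# Per-frame overhead on the wire: preamble (7) + SFD (1) + inter-frame gap (12).
ETHERNET_WIRE_OVERHEAD_BYTES = 20


def packets_per_second(link_bps, frame_bytes):
    """Maximum frame rate on an Ethernet link.

    frame_bytes is the frame size including the FCS (64 for minimum-size frames).
    """
    bits_on_wire = (frame_bytes + ETHERNET_WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def cycles_per_packet(cpu_hz, pps):
    """CPU cycles available per packet if the CPU does nothing else."""
    return cpu_hz / pps


def lock_overhead_fraction(atomic_ops_per_packet, cycles_per_atomic_op, budget_cycles):
    """Share of the per-packet cycle budget spent on atomic operations."""
    return atomic_ops_per_packet * cycles_per_atomic_op / budget_cycles


if __name__ == "__main__":
    pps = packets_per_second(1_000_000_000, 64)   # gigabit, minimum-size frames
    budget = cycles_per_packet(1_000_000_000, pps)  # hypothetical 1 GHz CPU
    # Hypothetical: 10 atomic operations per packet at 50 cycles each.
    share = lock_overhead_fraction(10, 50, budget)
    print(f"{pps:,.0f} packets/s, {budget:.0f} cycles per packet, "
          f"{share:.0%} of the budget spent on atomic operations")
```

The point is only that when the whole budget is a few hundred cycles
per packet, a handful of extra atomic operations is a visible fraction
of it, and slower CPUs have proportionally less room to absorb that.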

He said that work is being done in 6 that will probably make it into
5.5 to improve things, but the biggest improvement should be in 6.x.

Personally, I get the impression they're pushing 5.x through as sort
of a transition from 4.x to 6.x.  I have much more hope in 6.x than I
have had for 5.x, from a firewall perspective.  My 5.x servers are
great, but for firewall purposes I've been much less than impressed so
far with throughput on slower hardware, even on -CURRENT (now RELENG_6).