Hi all,

A few days ago I posted about stability issues with m0n0wall and my troubles with IPsec. m0n0wall was constantly logging packet lost errors and eventually crashing, which everyone, including me, suspected was a hardware-layer problem.

Well, I disabled IPsec on the m0n0wall boxes, and now the lockups have stopped, as have the "/kernel: vr0 packet lost" errors. IPsec wasn't finishing Phase 2 anyway. Now everything is stable, though without IPsec; I will find a way around that somehow.

Ciao,

Kerem Erciyes (k underscore erciyes at zegnaermenegildo dot it)
IT Manager
ISMACO Amsterdam BV (+90 216 394 00 00)
Ermenegildo Zegna Butik (+90 212 291 10 24)
----------------------------------------------
This message is OpenPGP signed; the content and the identity of the sender can be verified with the sender's public PGP key, which can be obtained upon request.
--------------------------------------------

Thursday, February 10, 2005, 2:26:58 PM, you wrote:

DL> At 14:43 08/02/2005 -0800, Fred Wright wrote:
>>On Tue, 8 Feb 2005, Didier Lebrun wrote:
>>
>> > The high latency does create some specific problems with TCP, since the
>> > [bandwidth * latency] product is too big for the standard TCP window size
>> > and packet loss can become dramatic in this kind of context. But the IETF
>> > has developed a set of TCP extensions (RFC 1323) in order to overcome
>> > these limits and have TCP perform better on LFNs (Long Fat Networks).
>> > Recent versions of FreeBSD support the RFC 1323 extensions and use them
>> > automatically when necessary, but the main problem is between the clients
>> > and the remote servers anyway, since the TCP transactions occur between
>> > ends, the gateway just letting the packets through, with the exception of
>> > DNS and NTP queries.
>>
>>Correct. Directly supporting RFC1323 on a router isn't terribly
>>important, although it *is* important that the router not screw it up,
>>e.g. by failing to handle window scaling correctly in stateful filtering.
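To make the [bandwidth * latency] argument above concrete, here is a minimal Python sketch. It assumes a hypothetical 2 Mbit/s downlink with a 600 ms round trip; those figures, and all function names, are illustrative and not taken from the thread. It compares the delay-bandwidth product with the 65535-byte unscaled window and finds the smallest RFC 1323 window-scale shift that would cover it.

```python
MAX_UNSCALED_WINDOW = 65535  # largest TCP window without window scaling
MAX_SCALE_SHIFT = 14         # upper limit on the RFC 1323 scale factor


def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Delay-bandwidth product: bytes that must be in flight to fill the link."""
    return bandwidth_bits_per_s / 8 * rtt_s


def window_limited_rate(window_bytes: int, rtt_s: float) -> float:
    """Best-case throughput in bits/s when the window is the only limit."""
    return window_bytes * 8 / rtt_s


def min_scale_shift(window_bytes: float) -> int:
    """Smallest shift such that (65535 << shift) covers window_bytes."""
    shift = 0
    while (MAX_UNSCALED_WINDOW << shift) < window_bytes:
        shift += 1
        if shift > MAX_SCALE_SHIFT:
            raise ValueError("window too large even for maximum scaling")
    return shift


if __name__ == "__main__":
    # Hypothetical satellite link: 2 Mbit/s down, 600 ms RTT (assumed values).
    bandwidth, rtt = 2_000_000, 0.6
    needed = bdp_bytes(bandwidth, rtt)
    print(f"delay-bandwidth product: {needed:.0f} bytes")
    print(f"rate with a 65535-byte window: "
          f"{window_limited_rate(MAX_UNSCALED_WINDOW, rtt):.0f} bit/s")
    print(f"minimum window-scale shift: {min_scale_shift(needed)}")
```

Under these assumed figures the product is about 150,000 bytes, so an unscaled window caps a single connection at roughly 874 kbit/s, well under the link rate.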
>>
>> > When RFC 1323 extensions are supported by the system, TCP adjusts itself by
>> > calculating the TCP window size as soon as the first ACK comes back... but
>>
>>That's not exactly correct. The default socket buffer size is set
>>independently from any knowledge of the peer's capability, but in the
>>absence of window scaling the usable window is clamped at 65535. That may
>>not even manage to avoid allocating the buffer based on the uselessly
>>larger size.

DL> I'm not sure I understand what you mean here. Doesn't TCP adjust the socket
DL> buffer and the TCP RWIN once it has received some ACKs, allowing it to
DL> calculate the proper TCP RWIN?

>> > before that, it uses the system default values, which are often badly
>> > optimized for sat links. So, it's better to tweak them on each client in
>>
>>They're often not even adequate for broadband to distant points. The
>>theoretical minimum RTT for halfway around the planet is about 133ms.
>>
>> > order to have it perform better. The main principles are:
>> > - enable "Window Scaling" (usually enabled by default)
>> > - enable "Time Stamping"
>>
>>Only the above two relate to RFC1323.

DL> It's true, but I didn't mean to say all options were part of 1323! I just
DL> meant to say that they can play some role in the case of sat links.

>> > - enable "Selective ACKs" (SACK)
>>
>>Although the original "long fat pipe" package included SACK, the original
>>SACK proposal was broken and hence explicitly *not* included in RFC1323.
>>A reworked SACK mechanism was later published as RFC2018. In general,
>>SACK support is less "mature", but fortunately it's the least important of
>>the three, especially if packet loss is low.
>>
>> > - enable "Path MTU Discovery"
>>
>>This has absolutely nothing to do with RFC1323, although bulk-transfer
>>efficiency on *all* links is better when the MTU can be chosen correctly
>>rather than using an arbitrary "conservative" value.
>>But then you have to worry about broken firewalls not forwarding the needed
>>ICMP packets.
>>
>> > - set the TCP receive window size to a higher value [max bandwidth
>> > in bytes/sec * max latency in sec]
>>
>>It's actually worse than that. Although that's sufficient to avoid
>>window-limited rates in the absence of packet loss, whenever a packet is
>>dropped it takes *two* RTTs (plus the time to trigger fast retransmit) to
>>get it through, and hence any window size smaller than twice the
>>delay-bandwidth product (and then some) will diminish the effectiveness of
>>fast retransmit.

DL> You might be right. I noticed some problems in case of packet loss,
DL> especially with WinXP clients, but couldn't figure them out. I'll have to
DL> get into the fast retransmit documentation to fully understand this point.

>>Note that this amount of buffer space is needed *per connection* even
>>though the only real requirement is for that amount of *total* outstanding
>>data. This can eat up RAM pretty quickly.
>>
>>Also note that similar issues apply to the *send* buffer size, although
>>upstream speeds are usually slower.
>>
>> > - disable "Black Hole Detection" (MS Windows only)
>>
>>This has nothing to do with RFC1323. It's actually a workaround for PMTUd
>>failures due to blocked ICMP. The only reason I could see for its having
>>anything to do with this is if the timeout for calling the path
>>"broken" is too short for a satlink.
>>
>> > - set "Max duplicate ACKs" = 2 (3 on Win98 only)
>>
>>Again that has nothing to do with RFC1323, but instead represents the fast
>>retransmit threshold. RFC2581 recommends 3. Lower values reduce the
>>amount of send buffer needed to avoid window stalls after dropped packets,
>>but increase the risk of unnecessary retransmissions. Also note that this
>>only affects *send* performance.
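The sizing rule discussed above (a window of at least twice the delay-bandwidth product, needed per connection) can be sketched as follows. This is only an illustration: the link speed, RTT and connection count are assumed values, the function names are hypothetical, and the "and then some" margin Fred mentions is left out.

```python
def delay_bandwidth_product(bytes_per_s: int, rtt_ms: int) -> int:
    """Bytes in flight needed to keep the link full, rounded up."""
    return (bytes_per_s * rtt_ms + 999) // 1000


def loss_tolerant_window(bytes_per_s: int, rtt_ms: int,
                         rtts_to_recover: int = 2) -> int:
    """Window covering the RTTs it takes to get a dropped packet through.

    The default of 2 follows the "twice the delay-bandwidth product"
    rule of thumb; any extra margin is not modelled here.
    """
    return rtts_to_recover * delay_bandwidth_product(bytes_per_s, rtt_ms)


def total_buffer_bytes(window_bytes: int, connections: int) -> int:
    """Worst-case buffer memory when every connection gets a full window."""
    return window_bytes * connections


if __name__ == "__main__":
    # Assumed link: 250,000 bytes/s (2 Mbit/s) with a 600 ms RTT.
    window = loss_tolerant_window(250_000, 600)
    print(f"per-connection window: {window} bytes")
    # Assumed load: 50 simultaneous connections.
    print(f"worst-case buffer memory: {total_buffer_bytes(window, 50)} bytes")
```

With these assumed numbers each connection wants a 300,000-byte window, and 50 such connections could pin 15 MB of buffer memory, which illustrates why this is a poor fit for a small router.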
DL> You're right in principle, but I discussed this point with a sat
DL> technician, who recommended that I reduce the retransmit threshold, since 2
DL> RTTs is already quite long on a sat link, and the risk of having
DL> packets still arriving later than that is pretty low. I don't remember the
DL> reason he gave me for the Win98 exception.

>> > On FreeBSD, you can adjust a few things too:
>>
>>But I wouldn't recommend doing this to m0n0wall, since it's rarely a TCP
>>endpoint and often can't afford the RAM.
>>
>> > If you are using FreeBSD's traffic shaping capabilities, you must adjust the
>> > size of the queues too, in order to avoid packet drops when the queue is
>> > full. You can set each download queue to the TCP receive window size, and
>> > each upload queue to the TCP sendspace. The same for the main pipes
>> > (96Kbytes and 24Kbytes in our case).
>>
>>But the queues aren't what's filling up. The extra data that one has to
>>accommodate is literally "up in the air" (or at least the vacuum). If the
>>traffic shaper is just dealing with packets, it shouldn't care. If it's
>>trying to be clever enough to watch TCP SEQ and ACK numbers but not clever
>>enough to take large RTTs into account, then it's broken. In no case
>>should it ever be necessary to buffer significant data *in the router*.
>>In fact, excessive buffering in routers simply increases the overall
>>delay-bandwidth product (by increasing latency) and thus requires *more*
>>buffering at the endpoints.
>>
>>Theoretically the same argument would apply to the socket buffers, but the
>>problem is that the receiver can't offer window unless it can commit to
>>receiving that amount of data regardless of application behavior. The
>>send-side buffer is needed because it can't be certain that the data has
>>been delivered until it gets the end-to-end acknowledgment.

DL> My argument might not be relevant for m0n0wall's traffic shaping. I've not
DL> studied it enough to tell.
DL> On our gateway, we use DUMMYNET + IPFW2 + NATD
DL> (static firewall) with a principle of pipe sharing, whatever the bandwidth
DL> is at a given moment, without setting any absolute value for each pipe or
DL> queue, since I observed that setting an absolute value was increasing the
DL> latency by approximately 200 to 250 ms. Each client obtains a fair share
DL> of the whole pipes (upload and download), depending on how many clients are
DL> using the link simultaneously, with a weight depending on port numbers. So
DL> we can sometimes have one client using the full link at its best capacity,
DL> and a whole TCP window can get stuck in any queue before TCP stops
DL> transmitting more. That's why we need to set a whole TCP window size for
DL> each queue in order to avoid packet drops. I made some experimental
DL> observations by setting various values and looking at packet drops, and I
DL> found that some drops did occur when the queue size was under 75% of the
DL> TCP window, and disappeared above this value. I supposed the difference
DL> between 75% and 100% was because [MAX ... * MAX ...] is overestimated.

>> > Another kind of problem that can arise is MTU/MSS miscalculations, since
>> > the TCP header is 4 bytes longer than usual when using the RFC 1323
>> > extensions. They fill the 6th optional TCP header line, thus producing
>> > headers of 44 bytes (20 IP + 24 TCP) instead of 40 usually (20 IP + 20
>> > TCP). It can create problems when using VPNs or any kind of encapsulation.
>>
>>I don't know where you get that particular number. There's an option for
>>window scaling, but it appears only in the initial SYN segments. Ditto
>>for the option *enabling* SACK. The timestamps option adds *12* bytes
>>(including padding) to every segment. SACKs add a variable amount, but
>>for most application protocols (simplex or half-duplex) tend to appear in
>>otherwise empty packets.

DL> You're probably right.
DL> I did observe 44-byte headers, but I don't remember
DL> checking whether all headers had this size.

>> Fred Wright

DL> Thanks for the clarifications :-)

DL> --
DL> Didier Lebrun
DL> Le bourg - 81140 - Vaour (France)
DL> tel: 05.63.53.73.41 (mornings and evenings)
DL> mailto:dl at vaour dot net (MIME, ISO latin 1)
DL> http://didier.quartier-rural.org/
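As a footnote to the header-size exchange in this thread, the arithmetic can be tallied in a few lines of Python. This is a rough sketch: it uses the 20-byte IP and TCP base headers and the 12 bytes (including padding) that Fred cites for the timestamps option, while the `encapsulation_overhead` parameter and the 1500-byte MTU in the example are assumptions added for illustration.

```python
IP_HEADER = 20         # IPv4 header without options
TCP_HEADER = 20        # TCP header without options
TIMESTAMP_OPTION = 12  # timestamps option including padding, per segment


def header_bytes(timestamps: bool) -> int:
    """IP + TCP header size for an ordinary (non-SYN) data segment."""
    return IP_HEADER + TCP_HEADER + (TIMESTAMP_OPTION if timestamps else 0)


def payload_per_segment(mtu: int, timestamps: bool,
                        encapsulation_overhead: int = 0) -> int:
    """TCP payload bytes that fit in one packet.

    encapsulation_overhead is a hypothetical per-packet cost of a VPN or
    tunnel; the real value depends on the encapsulation in use.
    """
    return mtu - encapsulation_overhead - header_bytes(timestamps)


if __name__ == "__main__":
    for ts in (False, True):
        print(f"timestamps={ts}: headers={header_bytes(ts)} bytes, "
              f"payload at MTU 1500={payload_per_segment(1500, ts)} bytes")
```

Data segments carry either 40 or 52 bytes of headers under these figures, so any tunnel overhead should be budgeted against the larger value when timestamps are enabled.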