 From:  "Fred Mol" <fredlist at xs4all dot nl>
 To:  m0n0wall at lists dot m0n0 dot ch
 Subject:  pptp WAN link freezes
 Date:  Mon, 12 Jul 2004 18:29:05 +0200 (CEST)

I'm running m0n0wall 1.1b15 with the following setup:

(              (
[adsl modem] ---(pptp)----[m0n0wall]--- LAN:
   WAN                              --- DMZ:
                                    --- Guest:

The pptp link seems to work perfectly after a reboot, but after some
time (varying from an hour to several days), the link almost freezes.
The speed becomes so low that most operations (web browsing, mail
checking) time out. Some traffic still comes through though: an ftp
upload that normally takes 6 minutes finished in 2 hours.

The situation is hard to reproduce. Downloading large files seems to
increase the chances of a freeze, though.

Some observations:
- There is no significant load on the m0n0wall: load average is usually 0.
- Traffic from LAN -> DMZ is normal
- Resetting just the pptp link, by pressing "Save" in the unchanged WAN
  configuration page, fixes the problem ... for an hour, a day, or more.
- The firewall log shows a large number of blocked ACK and FIN packets.
  For example:
10:01:29.168096 ng0 @0:35 b,80 ->,21674 PR tcp len 20 1319 -AFP IN
10:01:32.643358 ng0 @0:35 b,80 ->,18011 PR tcp len 20 590 -AFP IN
10:01:35.352920 ng0 @0:35 b,80 ->,13379 PR tcp len 20 551 -AFP IN
10:01:41.614373 ng0 @0:35 b,80 ->,23592 PR tcp len 20 243 -AFP IN
10:01:44.958001 ng0 @0:35 b,80 ->,14350 PR tcp len 20 1500 -A IN
10:01:50.581930 ng0 @0:35 b,80 ->,23411 PR tcp len 20 242 -AFP IN
10:01:51.157559 ng0 @0:35 b,80 ->,14350 PR tcp len 20 1500 -A IN

- I ran tcpdump on xl0 (the WAN interface that talks to the adsl modem)
  and ng0 (the pptp interface), but I find the output difficult to
  analyze. Running ethereal on the tcpdump output, I at least managed to
  find out that:
  - Traffic volume is low: it takes about 10 minutes to capture 200
  - IP traffic from modem to m0n0wall is fragmented: most packets arrive
    in two frames with a payload of 1480 and 36 bytes.
  - There are a lot of "TCP Acked lost segment" messages, as ethereal
    calls them, from my-external-ip -> some-web-site (on xl0). These are
    ACKs that ethereal can't match with a sent segment.
  - The tcpdump of ng0 shows a lot of retransmissions, duplicate acks and
    out-of-order segments.
- After resetting the connection (link now at normal speed):
  - IP traffic from modem to m0n0wall is still fragmented
  - There are still a lot of "TCP Acked lost segment" messages
  - The tcpdump of ng0 now seems "normal".
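
As a rough check on the fragmentation observation above: a 1480 + 36 byte
split is what plain IPv4 fragmentation would produce for a 1516-byte
payload on a link with a 1500-byte MTU. The sketch below is only an
illustration under stated assumptions: a 20-byte IP header without
options, and 16 bytes of GRE/PPP tunnel overhead on top of a 1500-byte
inner packet. The overhead figure is a guess that happens to match the
numbers, not something measured in the captures, and the function name is
made up for this example.

```python
def fragment_payloads(total_payload, mtu=1500, ip_header=20):
    """Return the payload size of each IPv4 fragment for one datagram.

    Every fragment except the last must carry a multiple of 8 bytes,
    because the fragment offset field counts in 8-byte units.
    """
    max_payload = (mtu - ip_header) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")

    sizes = []
    remaining = total_payload
    while remaining > max_payload:
        sizes.append(max_payload)
        remaining -= max_payload
    sizes.append(remaining)
    return sizes


inner_packet = 1500      # largest packet length seen in the firewall log
tunnel_overhead = 16     # ASSUMPTION: GRE + PPP header bytes, not measured

print(fragment_payloads(inner_packet + tunnel_overhead))  # [1480, 36]
```

If that assumption holds, every full-size packet through the tunnel is
split in two on xl0, both before and after the link is reset, which fits
the captures. Under the same assumptions, inner packets of 1464 bytes or
less would fit in a single frame.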

Does anyone have an idea what might be going on, how to fix it, or how
to analyze it further? I can provide the output of status.php and the
binary tcpdumps if that helps, but I thought it best not to attach
binary files to my first email to this list :-)


--Fred Mol