From: Jiri Pirko <jpirko@redhat.com> Date: Tue, 11 Jan 2011 12:15:05 -0500 Subject: [net] tcp: fix shrinking windows with window scaling Message-id: <20110111121505.GA2688@psychotron.brq.redhat.com> Patchwork-id: 31475 O-Subject: [RHEL5.7 patch] BZ627496 net: tcp: Fix shrinking windows with window scaling Bugzilla: 627496 RH-Acked-by: Rik van Riel <riel@redhat.com> RH-Acked-by: David S. Miller <davem@redhat.com> BZ627496 https://bugzilla.redhat.com/show_bug.cgi?id=627496 Description: When selecting a new window, tcp_select_window() tries not to shrink the offered window by using the maximum of the remaining offered window size and the newly calculated window size. The newly calculated window size is always a multiple of the window scaling factor, the remaining window size however might not be since it depends on rcv_wup/rcv_nxt. This means we're effectively shrinking the window when scaling it down. The dump below shows the problem (scaling factor 2^7): - Window size of 557 (71296) is advertised, up to 3111907257: IP 172.2.2.3.33000 > 172.2.2.2.33000: . ack 3111835961 win 557 <...> - New window size of 514 (65792) is advertised, up to 3111907217, 40 bytes below the last end: IP 172.2.2.3.33000 > 172.2.2.2.33000: . 3113575668:3113577116(1448) ack 3111841425 win 514 <...> The number 40 results from downscaling the remaining window: 3111907257 - 3111841425 = 65832 65832 / 2^7 = 514 65832 % 2^7 = 40 If the sender uses up the entire window before it is shrunk, this can have chaotic effects on the connection. When sending ACKs, tcp_acceptable_seq() will notice that the window has been shrunk since tcp_wnd_end() is before tp->snd_nxt, which makes it choose tcp_wnd_end() as sequence number. This will fail the receivers checks in tcp_sequence() however since it is before it's tp->rcv_wup, making it respond with a dupack. If both sides are in this condition, this leads to a constant flood of ACKs until the connection times out. Make sure the window is never shrunk by aligning the remaining window to the window scaling factor. Upstream: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=607bfbf2d55dd1cfe5368b41c2a81a8c9ccf4723 Brew: https://brewweb.devel.redhat.com/taskinfo?taskID=3022117 Jirka Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 2c9546b..0a88bbc 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -246,7 +246,7 @@ static u16 tcp_select_window(struct sock *sk) * * Relax Will Robinson. */ - new_win = cur_win; + new_win = ALIGN(cur_win, 1 << tp->rx_opt.rcv_wscale); } tp->rcv_wnd = new_win; tp->rcv_wup = tp->rcv_nxt;