kernel-2.6.18-238.19.1.el5.centos.plus.src.rpm

From: John Feeney <jfeeney@redhat.com>
Date: Fri, 20 Nov 2009 19:37:14 -0500
Subject: [net] sched: fix panic in bnx2_poll_work
Message-id: <4B06EFEA.5090500@redhat.com>
Patchwork-id: 21458
O-Subject: [RHEL5.5 PATCH] Fix panic in bnx2_poll_work()
Bugzilla: 526481
RH-Acked-by: Andy Gospodarek <gospo@redhat.com>
RH-Acked-by: Dean Nelson <dnelson@redhat.com>
RH-Acked-by: Neil Horman <nhorman@redhat.com>

bz526481
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=526481
bnx2: panic in bnx2_poll_work()

Description of problem:
While testing NFS, bnx2 panicked with the following call
stack:

Call Trace:
 <IRQ>  [<ffffffff800c69b5>] free_pages_bulk+0x1f0/0x268
 [<ffffffff80148c7f>] deadline_init_queue+0xdc/0x11b
 [<ffffffff80088b7f>] elf_core_dump+0xc1c/0xc2c
 [<ffffffff881fbd0e>] :bnx2:bnx2_poll+0xdf/0x209
 [<ffffffff8000c845>] net_rx_action+0xac/0x1e0
 [<ffffffff8001235a>] __do_softirq+0x89/0x133
 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
 [<ffffffff8006cb14>] do_softirq+0x2c/0x85
 [<ffffffff8006c99c>] do_IRQ+0xec/0xf5
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff8000e38f>] mark_page_accessed+0x2/0x68
 [<ffffffff8000b3e2>] __find_get_block+0x15c/0x16c
 [<ffffffff800076c0>] find_get_page+0x21/0x51
 [<ffffffff80019a8a>] __getblk+0x1d/0x236
 [<ffffffff8804daa9>] :ext3:__ext3_get_inode_loc+0x12f/0x2f9
 [<ffffffff8804dca7>] :ext3:ext3_reserve_inode_write+0x23/0x90
 [<ffffffff8804dd35>] :ext3:ext3_mark_inode_dirty+0x21/0x3c
 [<ffffffff88050c8a>] :ext3:ext3_dirty_inode+0x63/0x7b
 [<ffffffff80013b89>] __mark_inode_dirty+0x29/0x16e
 [<ffffffff8804e04f>] :ext3:ext3_generic_write_end+0x3e/0x46
 [<ffffffff8804ff70>] :ext3:ext3_ordered_write_end+0xb3/0x116
 [<ffffffff8000fd42>] generic_file_buffered_write+0x1cc/0x675
 [<ffffffff8001651f>] __generic_file_aio_write_nolock+0x369/0x3b6
 [<ffffffff8002157b>] generic_file_aio_write+0x65/0xc1
 [<ffffffff8804c1b6>] :ext3:ext3_file_write+0x16/0x91
 [<ffffffff8001812f>] do_sync_write+0xc7/0x104
 [<ffffffff8009f7b6>] autoremove_wake_function+0x0/0x2e
 [<ffffffff800420a5>] do_ioctl+0x21/0x6b
 [<ffffffff80016927>] vfs_write+0xce/0x174
 [<ffffffff800171df>] sys_write+0x45/0x6e
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0


Code: 49 8b 85 e8 00 00 00 66 83 78 06 00 74 25 8b 40 04 8d 54 05
RIP  [<ffffffff881fa7c1>] :bnx2:bnx2_poll_work+0x90/0x1227
 RSP <ffff81037f8d3d10>



Solution:
Broadcom provided a pointer to the following upstream patch.


pkt_sched: Fix return value corruption in HTB and TBF.


Packet schedulers should only return NET_XMIT_DROP iff
the packet really was dropped.  If the packet does reach
the device after we return NET_XMIT_DROP then TCP can
crash because it depends upon the enqueue path return
values being accurate.

upstream commit:
69747650c814a8a79fef412c7416adf823293a3e
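
The rule quoted above can be illustrated with a small userspace model. This is only a sketch, not kernel code: every toy_* name is hypothetical, and it assumes a child qdisc that keeps the packet while returning a congestion code. The enum values match the kernel's NET_XMIT_* constants. The model contrasts a parent that collapses every non-success code into a drop with one that passes the child's code through.

```c
#include <assert.h>
#include <stdbool.h>

/* Same numeric values as the kernel's NET_XMIT_* codes; the TOY_ prefix
 * marks everything here as a userspace model, not kernel code. */
enum {
	TOY_XMIT_SUCCESS = 0,
	TOY_XMIT_DROP    = 1,
	TOY_XMIT_CN      = 2,	/* congestion notification */
};

/* Hypothetical packet: only tracks whether some queue still holds it. */
struct toy_pkt {
	bool queued;
};

/* Hypothetical leaf qdisc. ASSUMPTION: under congestion it keeps the
 * packet and returns CN; when full it discards it and returns DROP. */
int toy_leaf_enqueue(struct toy_pkt *p, bool congested, bool full)
{
	if (full) {
		p->queued = false;
		return TOY_XMIT_DROP;
	}
	p->queued = true;
	return congested ? TOY_XMIT_CN : TOY_XMIT_SUCCESS;
}

/* Pre-patch shape: any non-success code from the child becomes DROP. */
int toy_parent_enqueue_old(struct toy_pkt *p, bool congested, bool full)
{
	if (toy_leaf_enqueue(p, congested, full) != TOY_XMIT_SUCCESS)
		return TOY_XMIT_DROP;
	return TOY_XMIT_SUCCESS;
}

/* Post-patch shape: the child's code is passed through unchanged. */
int toy_parent_enqueue_fixed(struct toy_pkt *p, bool congested, bool full)
{
	return toy_leaf_enqueue(p, congested, full);
}

/* Rule from the commit message: DROP is reported iff the packet is gone. */
bool toy_report_is_accurate(int ret, const struct toy_pkt *p)
{
	return (ret == TOY_XMIT_DROP) == !p->queued;
}

/* Exercises the congested-but-queued case against both parents. */
void toy_selfcheck(void)
{
	struct toy_pkt a = { false }, b = { false };
	int r_old = toy_parent_enqueue_old(&a, true, false);
	int r_fix = toy_parent_enqueue_fixed(&b, true, false);

	assert(r_old == TOY_XMIT_DROP && a.queued);	/* DROP, yet still queued */
	assert(!toy_report_is_accurate(r_old, &a));
	assert(r_fix == TOY_XMIT_CN && b.queued);
	assert(toy_report_is_accurate(r_fix, &b));
}
```

In this model the old parent reports a drop for a packet that is still queued and will later reach the device, which is the inaccuracy the upstream message warns about. The pass-through parent reports a drop only when the packet is really gone.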

Brew:
Successfully built in Brew

Testing:
With the patch included in an experimental kernel, the panic never
manifested. Connectathon was also run many times while testing bnx2
and tg3 for RHEL5 development, without a panic.

Acks would be appreciated. Thanks.

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 9ca4189..b53b5f5 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -719,7 +719,7 @@ static int htb_enqueue(struct sk_buff *skb, struct Qdisc *sch)
     } else if (cl->un.leaf.q->enqueue(skb, cl->un.leaf.q) != NET_XMIT_SUCCESS) {
 	sch->qstats.drops++;
 	cl->qstats.drops++;
-	return NET_XMIT_DROP;
+	return ret;
     } else {
 	cl->bstats.packets++; cl->bstats.bytes += skb->len;
 	htb_activate (q,cl);
@@ -753,7 +753,7 @@ static int htb_requeue(struct sk_buff *skb, struct Qdisc *sch)
     } else if (cl->un.leaf.q->ops->requeue(skb, cl->un.leaf.q) != NET_XMIT_SUCCESS) {
 	sch->qstats.drops++;
 	cl->qstats.drops++;
-	return NET_XMIT_DROP;
+	return ret;
     } else 
 	    htb_activate (q,cl);
 
diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
index d9a5d29..fffbc4a 100644
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -139,15 +139,8 @@ static int tbf_enqueue(struct sk_buff *skb, struct Qdisc* sch)
 	struct tbf_sched_data *q = qdisc_priv(sch);
 	int ret;
 
-	if (skb->len > q->max_size) {
-		sch->qstats.drops++;
-#ifdef CONFIG_NET_CLS_POLICE
-		if (sch->reshape_fail == NULL || sch->reshape_fail(skb, sch))
-#endif
-			kfree_skb(skb);
-
-		return NET_XMIT_DROP;
-	}
+	if (skb->len > q->max_size)
+		return qdisc_reshape_fail(skb, sch);
 
 	if ((ret = q->qdisc->enqueue(skb, q->qdisc)) != 0) {
 		sch->qstats.drops++;