From: John Feeney <jfeeney@redhat.com>
Date: Fri, 20 Nov 2009 19:37:14 -0500
Subject: [net] sched: fix panic in bnx2_poll_work
Message-id: <4B06EFEA.5090500@redhat.com>
Patchwork-id: 21458
O-Subject: [RHEL5.5 PATCH] Fix panic in bnx2_poll_work()
Bugzilla: 526481
RH-Acked-by: Andy Gospodarek <gospo@redhat.com>
RH-Acked-by: Dean Nelson <dnelson@redhat.com>
RH-Acked-by: Neil Horman <nhorman@redhat.com>

bz526481
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=526481
bnx2: panic in bnx2_poll_work()

Description of problem:
While testing NFS, bnx2 panicked with the following call stack:

Call Trace:
 <IRQ>  [<ffffffff800c69b5>] free_pages_bulk+0x1f0/0x268
 [<ffffffff80148c7f>] deadline_init_queue+0xdc/0x11b
 [<ffffffff80088b7f>] elf_core_dump+0xc1c/0xc2c
 [<ffffffff881fbd0e>] :bnx2:bnx2_poll+0xdf/0x209
 [<ffffffff8000c845>] net_rx_action+0xac/0x1e0
 [<ffffffff8001235a>] __do_softirq+0x89/0x133
 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
 [<ffffffff8006cb14>] do_softirq+0x2c/0x85
 [<ffffffff8006c99c>] do_IRQ+0xec/0xf5
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff8000e38f>] mark_page_accessed+0x2/0x68
 [<ffffffff8000b3e2>] __find_get_block+0x15c/0x16c
 [<ffffffff800076c0>] find_get_page+0x21/0x51
 [<ffffffff80019a8a>] __getblk+0x1d/0x236
 [<ffffffff8804daa9>] :ext3:__ext3_get_inode_loc+0x12f/0x2f9
 [<ffffffff8804dca7>] :ext3:ext3_reserve_inode_write+0x23/0x90
 [<ffffffff8804dd35>] :ext3:ext3_mark_inode_dirty+0x21/0x3c
 [<ffffffff88050c8a>] :ext3:ext3_dirty_inode+0x63/0x7b
 [<ffffffff80013b89>] __mark_inode_dirty+0x29/0x16e
 [<ffffffff8804e04f>] :ext3:ext3_generic_write_end+0x3e/0x46
 [<ffffffff8804ff70>] :ext3:ext3_ordered_write_end+0xb3/0x116
 [<ffffffff8000fd42>] generic_file_buffered_write+0x1cc/0x675
 [<ffffffff8001651f>] __generic_file_aio_write_nolock+0x369/0x3b6
 [<ffffffff8002157b>] generic_file_aio_write+0x65/0xc1
 [<ffffffff8804c1b6>] :ext3:ext3_file_write+0x16/0x91
 [<ffffffff8001812f>] do_sync_write+0xc7/0x104
 [<ffffffff8009f7b6>] autoremove_wake_function+0x0/0x2e
 [<ffffffff800420a5>] do_ioctl+0x21/0x6b
 [<ffffffff80016927>] vfs_write+0xce/0x174
 [<ffffffff800171df>] sys_write+0x45/0x6e
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

Code: 49 8b 85 e8 00 00 00 66 83 78 06 00 74 25 8b 40 04 8d 54 05
RIP  [<ffffffff881fa7c1>] :bnx2:bnx2_poll_work+0x90/0x1227
 RSP <ffff81037f8d3d10>

Solution:
Broadcom provided a pointer to the following upstream patch.

    pkt_sched: Fix return value corruption in HTB and TBF.

    Packet schedulers should only return NET_XMIT_DROP iff
    the packet really was dropped.  If the packet does reach
    the device after we return NET_XMIT_DROP then TCP can
    crash because it depends upon the enqueue path return
    values being accurate.

upstream commit: 69747650c814a8a79fef412c7416adf823293a3e

Brew:
Successfully built in Brew.

Testing:
With the patch included in an experimental kernel, the panic never
manifested. Connectathon was also run many times while testing bnx2 and
tg3 for RHEL5 development, again without a panic.

Acks would be appreciated. Thanks.

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 9ca4189..b53b5f5 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -719,7 +719,7 @@ static int htb_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 	} else if (cl->un.leaf.q->enqueue(skb, cl->un.leaf.q) != NET_XMIT_SUCCESS) {
 		sch->qstats.drops++;
 		cl->qstats.drops++;
-		return NET_XMIT_DROP;
+		return ret;
 	} else {
 		cl->bstats.packets++; cl->bstats.bytes += skb->len;
 		htb_activate (q,cl);
@@ -753,7 +753,7 @@ static int htb_requeue(struct sk_buff *skb, struct Qdisc *sch)
 	} else if (cl->un.leaf.q->ops->requeue(skb, cl->un.leaf.q) != NET_XMIT_SUCCESS) {
 		sch->qstats.drops++;
 		cl->qstats.drops++;
-		return NET_XMIT_DROP;
+		return ret;
 	} else
 		htb_activate (q,cl);

diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
index d9a5d29..fffbc4a 100644
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -139,15 +139,8 @@ static int tbf_enqueue(struct sk_buff *skb, struct Qdisc* sch)
 	struct tbf_sched_data *q = qdisc_priv(sch);
 	int ret;

-	if (skb->len > q->max_size) {
-		sch->qstats.drops++;
-#ifdef CONFIG_NET_CLS_POLICE
-		if (sch->reshape_fail == NULL || sch->reshape_fail(skb, sch))
-#endif
-			kfree_skb(skb);
-
-		return NET_XMIT_DROP;
-	}
+	if (skb->len > q->max_size)
+		return qdisc_reshape_fail(skb, sch);

 	if ((ret = q->qdisc->enqueue(skb, q->qdisc)) != 0) {
 		sch->qstats.drops++;
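
Illustration only, not part of the patch: the rule quoted above (report NET_XMIT_DROP only when the packet really was dropped) can be sketched with a toy userspace model. Everything below is hypothetical: `toy_leaf`, the `XMIT_*` constants and both `parent_enqueue_*` helpers are stand-ins invented for this sketch, and the "queued but congested" meaning given to `XMIT_CN` is an assumption of the model, not a statement about any particular kernel qdisc.

```c
#include <assert.h>

/* Toy return codes, loosely modeled on NET_XMIT_* (assumed for this sketch). */
enum { XMIT_SUCCESS = 0, XMIT_DROP = 1, XMIT_CN = 2 };

/* Hypothetical stand-in for a leaf qdisc: a bounded packet counter. */
struct toy_leaf {
	int qlen;     /* packets currently queued */
	int cn_mark;  /* above this length: accept, but signal congestion */
	int limit;    /* at this length: the packet is really dropped */
};

/* Enqueue into the leaf and report what actually happened to the packet. */
static int toy_leaf_enqueue(struct toy_leaf *q)
{
	if (q->qlen >= q->limit)
		return XMIT_DROP;   /* packet discarded, not queued */
	q->qlen++;
	if (q->qlen > q->cn_mark)
		return XMIT_CN;     /* packet queued, congestion signalled */
	return XMIT_SUCCESS;
}

/* Pre-patch pattern: every non-success result is reported as a drop. */
static int parent_enqueue_old(struct toy_leaf *q)
{
	if (toy_leaf_enqueue(q) != XMIT_SUCCESS)
		return XMIT_DROP;
	return XMIT_SUCCESS;
}

/* Post-patch pattern: the leaf's own return value is passed through. */
static int parent_enqueue_new(struct toy_leaf *q)
{
	int ret = toy_leaf_enqueue(q);

	return ret;
}

/* Usage: the old path claims a drop although the packet is in the queue. */
static void toy_demo(void)
{
	struct toy_leaf q = { .qlen = 1, .cn_mark = 1, .limit = 3 };

	assert(parent_enqueue_old(&q) == XMIT_DROP && q.qlen == 2);
	assert(parent_enqueue_new(&q) == XMIT_CN && q.qlen == 3);
}
```

In this model the old parent tells its caller the packet is gone while the leaf still holds it, which is the kind of mismatch the upstream description warns about. The new parent reports a drop only when the leaf itself did.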