Date: Fri, 13 Oct 2006 18:30:16 -0400 (EDT)
From: "Janice M. Girouard" <jgirouar@redhat.com>
Subject: [RHEL 5 PPC PATCH] RHBZ # 199129 ibmveth panics in kdump boot

RHBZ#:
------
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=199129

Description:
------------
Patch ibmveth_harden_init.patch:
After a kexec the ibmveth driver will fail when trying to register with
the Hypervisor because the previous kernel has not unregistered. So if
the registration fails, we unregister and then try again. We don't
unconditionally unregister, because we don't want to disturb the regular
code path for 99% of users.

Patch ibmveth_kdump_fix.patch:
This patch fixes a race that panics the kernel when opening the device
after a kdump. Without this patch there is a window where the hypervisor
can send an interrupt before all the structures for the kdump ibmveth
module are ready (because the hypervisor is not aware that the partition
crashed and that the virtual driver is reloading). We close this window
by disabling the interrupts before registering the adapter with the
hypervisor. (This patch depends on ibmveth_harden_init.patch.)

RHEL Version Found:
-------------------
RHEL5b1

Upstream Status:
----------------
The patches have been accepted into the mainline tree:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=bbedefccc6b0da43cfaf785dac89c88bc59cb6ed
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4347ef15f76dca33ae8da769d6900a468253bda2

Test Status:
------------
This patch is based on RHEL5 beta 1. The patches were tested by
performing a successful kdump on a partition with an ibmveth adapter.

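For reviewers who want to exercise the control flow outside the kernel,
the retry-once registration and the "disable interrupts before
registering" ordering described above can be sketched in user space.
This is only a simulation under stated assumptions: struct sim_hv, the
SIM_* return codes, and every sim_* function are hypothetical stand-ins
invented for this illustration. They are not the driver's code and do
not model the real hypervisor calls or their return values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes for this simulation only; they are NOT the
 * real hypervisor values used by the driver. */
enum sim_rc { SIM_SUCCESS = 0, SIM_BUSY = 1, SIM_FAIL = -1 };

/* Hypothetical model of the hypervisor-side state for one adapter. */
struct sim_hv {
	int registered;         /* 1 if a (possibly stale) registration exists */
	int busy_replies;       /* number of SIM_BUSY replies before free works */
	int always_fail;        /* 1 to make every register attempt fail */
	int irq_enabled;        /* 1 while the adapter may raise interrupts */
	int irq_on_at_register; /* irq state seen by the last register attempt */
	int register_calls;     /* how often sim_register() was called */
	int free_calls;         /* how often sim_free() was called */
};

/* Registration fails while a previous registration is still in place. */
static long sim_register(struct sim_hv *hv)
{
	hv->register_calls++;
	hv->irq_on_at_register = hv->irq_enabled;
	if (hv->always_fail || hv->registered)
		return SIM_FAIL;
	hv->registered = 1;
	return SIM_SUCCESS;
}

/* Freeing may report "busy" a few times before it takes effect. */
static long sim_free(struct sim_hv *hv)
{
	hv->free_calls++;
	if (hv->busy_replies > 0) {
		hv->busy_replies--;
		return SIM_BUSY;
	}
	hv->registered = 0;
	return SIM_SUCCESS;
}

/* Try to register; on failure free the stale registration (looping
 * while the free call reports busy) and retry exactly once. */
long sim_register_with_retry(struct sim_hv *hv)
{
	long rc;
	int try_again = 1;

	assert(hv != NULL);
retry:
	rc = sim_register(hv);
	if (rc != SIM_SUCCESS && try_again) {
		do {
			rc = sim_free(hv);
		} while (rc == SIM_BUSY);

		try_again = 0;
		goto retry;
	}
	return rc;
}

/* Disable interrupts first, then register, so that no interrupt can
 * arrive in the window before registration completes. */
long sim_open(struct sim_hv *hv)
{
	hv->irq_enabled = 0;
	return sim_register_with_retry(hv);
}
```

With a stale state such as { .registered = 1, .busy_replies = 2,
.irq_enabled = 1 }, sim_open() is expected to succeed on the second
registration attempt, with interrupts already off when registration is
attempted. On a fresh state the first attempt succeeds and the free path
is never taken, which mirrors the intent of leaving the regular code
path undisturbed.
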
Proposed Patch:
----------------
ibmveth_harden_init.patch
http://bugzilla.redhat.com/bugzilla/attachment.cgi?id=138464
and
ibmveth_kdump_fix.patch
http://bugzilla.redhat.com/bugzilla/attachment.cgi?id=138465

diff -urNp a/drivers/net/ibmveth.c b/drivers/net/ibmveth.c
--- a/drivers/net/ibmveth.c	2006-09-27 16:38:46.000000000 -0500
+++ b/drivers/net/ibmveth.c	2006-10-13 19:53:48.000000000 -0500
@@ -437,6 +437,31 @@ static void ibmveth_cleanup(struct ibmve
 					 &adapter->rx_buff_pool[i]);
 }
 
+static int ibmveth_register_logical_lan(struct ibmveth_adapter *adapter,
+		union ibmveth_buf_desc rxq_desc, u64 mac_address)
+{
+	int rc, try_again = 1;
+
+	/* After a kexec the adapter will still be open, so our attempt to
+	 * open it will fail. So if we get a failure we free the adapter and
+	 * try again, but only once. */
+retry:
+	rc = h_register_logical_lan(adapter->vdev->unit_address,
+				    adapter->buffer_list_dma, rxq_desc.desc,
+				    adapter->filter_list_dma, mac_address);
+
+	if (rc != H_SUCCESS && try_again) {
+		do {
+			rc = h_free_logical_lan(adapter->vdev->unit_address);
+		} while (H_IS_LONG_BUSY(rc) || (rc == H_BUSY));
+
+		try_again = 0;
+		goto retry;
+	}
+
+	return rc;
+}
+
 static int ibmveth_open(struct net_device *netdev)
 {
 	struct ibmveth_adapter *adapter = netdev->priv;
@@ -502,12 +527,7 @@ static int ibmveth_open(struct net_devic
 	ibmveth_debug_printk("filter list @ 0x%p\n", adapter->filter_list_addr);
 	ibmveth_debug_printk("receive q @ 0x%p\n", adapter->rx_queue.queue_addr);
 
-
-	lpar_rc = h_register_logical_lan(adapter->vdev->unit_address,
-					 adapter->buffer_list_dma,
-					 rxq_desc.desc,
-					 adapter->filter_list_dma,
-					 mac_address);
+	lpar_rc = ibmveth_register_logical_lan(adapter, rxq_desc, mac_address);
 
 	if(lpar_rc != H_SUCCESS) {
 		ibmveth_error_printk("h_register_logical_lan failed with %ld\n", lpar_rc);
diff -urNp a/drivers/net/ibmveth.c b/drivers/net/ibmveth.c
--- a/drivers/net/ibmveth.c	2006-10-13 19:55:27.000000000 -0500
+++ b/drivers/net/ibmveth.c	2006-10-13 19:55:45.000000000 -0500
@@ -527,6 +527,8 @@ static int ibmveth_open(struct net_devic
 	ibmveth_debug_printk("filter list @ 0x%p\n", adapter->filter_list_addr);
 	ibmveth_debug_printk("receive q @ 0x%p\n", adapter->rx_queue.queue_addr);
 
+	h_vio_signal(adapter->vdev->unit_address, VIO_IRQ_DISABLE);
+
 	lpar_rc = ibmveth_register_logical_lan(adapter, rxq_desc, mac_address);
 
 	if(lpar_rc != H_SUCCESS) {
__
Posted for your consideration and ACK for RHEL5