From: Don Dutile <ddutile@redhat.com>
Date: Mon, 27 Oct 2008 12:24:19 -0400
Subject: [xen] PV: dom0 hang when device re-attached to in guest
Message-id: 4905EB33.1010909@redhat.com
O-Subject: [RHEL5.3 Patch] Prevent dom0 hang when PV device re-attached to in guest OS
Bugzilla: 467773
RH-Acked-by: Markus Armbruster <armbru@redhat.com>
RH-Acked-by: Chris Lalancette <clalance@redhat.com>
RH-Acked-by: Bill Burns <bburns@redhat.com>

BZ 467773

When a blktap PV (disk) device is detached and then re-attached in a
Windows guest OS, it can cause the dom0/host to hang due to unreleased
grant table and event channel resources.

Fujitsu found and reported this problem.
The fix is needed in RHEL5.3 due to large Fujitsu system sales.

The upstream fix is in XenSource's linux-2.6.18-xen tree, cset 662.
Note: this is the same fix that is already in the blkback code.

Brew built on a -120 tree, brew build 153331
( https://brewweb.devel.redhat.com/taskinfo?taskID=1533311 ).

Fujitsu tested brew-built kernels and reported successful testing
over the weekend.

Please review & ack.

Note: a malicious guest could attach/detach/re-attach a blktap device
and hang a dom0/host.

- Don

diff --git a/drivers/xen/blktap/common.h b/drivers/xen/blktap/common.h
index 222c50b..991f5e4 100644
--- a/drivers/xen/blktap/common.h
+++ b/drivers/xen/blktap/common.h
@@ -90,6 +90,7 @@ typedef struct blkif_st {
 
 blkif_t *tap_alloc_blkif(domid_t domid);
 void tap_blkif_free(blkif_t *blkif);
+void tap_blkif_kmem_cache_free(blkif_t *blkif);
 int tap_blkif_map(blkif_t *blkif, unsigned long shared_page,
 		  unsigned int evtchn);
 void tap_blkif_unmap(blkif_t *blkif);
diff --git a/drivers/xen/blktap/interface.c b/drivers/xen/blktap/interface.c
index 5a736e2..d3520d7 100644
--- a/drivers/xen/blktap/interface.c
+++ b/drivers/xen/blktap/interface.c
@@ -182,8 +182,15 @@ void tap_blkif_free(blkif_t *blkif)
 {
 	atomic_dec(&blkif->refcnt);
 	wait_event(blkif->waiting_to_free, atomic_read(&blkif->refcnt) == 0);
+	atomic_inc(&blkif->refcnt);
 
 	tap_blkif_unmap(blkif);
+}
+
+void tap_blkif_kmem_cache_free(blkif_t *blkif)
+{
+	if (!atomic_dec_and_test(&blkif->refcnt))
+		BUG();
 	kmem_cache_free(blkif_cachep, blkif);
 }
 
diff --git a/drivers/xen/blktap/xenbus.c b/drivers/xen/blktap/xenbus.c
index 989fd8f..d8c249b 100644
--- a/drivers/xen/blktap/xenbus.c
+++ b/drivers/xen/blktap/xenbus.c
@@ -182,6 +182,7 @@ static int blktap_remove(struct xenbus_device *dev)
 			kthread_stop(be->blkif->xenblkd);
 		signal_tapdisk(be->blkif->dev_num);
 		tap_blkif_free(be->blkif);
+		tap_blkif_kmem_cache_free(be->blkif);
 		be->blkif = NULL;
 	}
 	kfree(be);
@@ -362,6 +363,7 @@ static void tap_frontend_changed(struct xenbus_device *dev,
 			kthread_stop(be->blkif->xenblkd);
 			be->blkif->xenblkd = NULL;
 		}
+		tap_blkif_free(be->blkif);
 		xenbus_switch_state(dev, XenbusStateClosing);
 		break;
 
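
For reviewers, the two-phase teardown that this patch introduces can be sketched in user-space C. This is an illustrative sketch only, not the kernel code: demo_blkif, demo_alloc, demo_release and demo_destroy are hypothetical names, C11 atomics stand in for the kernel's atomic_t, the unmap step is assumed to be idempotent, and an assert stands in for wait_event() because the sketch is single-threaded.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for blkif_t; only the fields needed here. */
struct demo_blkif {
	atomic_int refcnt;	/* starts at 1: the owner's reference */
	bool mapped;		/* stands in for grant mapping + event channel */
};

static struct demo_blkif *demo_alloc(void)
{
	struct demo_blkif *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	atomic_init(&b->refcnt, 1);
	b->mapped = true;
	return b;
}

/* Assumed idempotent: releasing an already released mapping is a no-op. */
static void demo_unmap(struct demo_blkif *b)
{
	b->mapped = false;
}

/*
 * Phase 1 (mirrors the patched tap_blkif_free): drop the owner's
 * reference, wait for all users to finish, take the reference back,
 * then release the shared resources. The object itself stays valid.
 */
static void demo_release(struct demo_blkif *b)
{
	atomic_fetch_sub(&b->refcnt, 1);
	/* The kernel sleeps in wait_event() here; this sketch has no other users. */
	assert(atomic_load(&b->refcnt) == 0);
	atomic_fetch_add(&b->refcnt, 1);
	demo_unmap(b);
}

/*
 * Phase 2 (mirrors tap_blkif_kmem_cache_free): free the memory only if
 * the owner's reference is the last one. The kernel code calls BUG()
 * where this sketch returns false.
 */
static bool demo_destroy(struct demo_blkif *b)
{
	if (atomic_fetch_sub(&b->refcnt, 1) != 1)
		return false;
	free(b);
	return true;
}
```

Calling demo_release() twice and then demo_destroy() mirrors the order in the patch: tap_frontend_changed() releases the resources when the frontend closes, and blktap_remove() later releases again and frees the memory.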