From: Jesse Larrew <jlarrew@redhat.com>
Date: Thu, 14 May 2009 00:59:39 -0400
Subject: [scsi] Retry mode select in rdac device handler
Message-id: 20090514045648.10947.8280.sendpatchset@squad5-lp1.lab.bos.redhat.com
O-Subject: [PATCH RHEL5.4 1/10 BZ489582] Retry mode select in rdac device handler
Bugzilla: 489582
RH-Acked-by: Mike Christie <mchristi@redhat.com>

RHBZ#:
======
https://bugzilla.redhat.com/show_bug.cgi?id=489582

Description:
===========
This is a bug fix for all archs.

When the mode select sent to the controller fails with a retryable
error, it is better to retry the mode select from the hardware handler
itself instead of propagating the failure to dm-multipath.

RHEL Version Found:
================
RHEL 5.3

kABI Status:
============
No symbols were harmed.

Brew:
=====
Built on all platforms.
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1794596

Upstream Status:
================
commit: c85f8cb9254e60cd25a094329c9dc9185c2140e7
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c85f8cb9254e60cd25a094329c9dc9185c2140e7

Test Status:
============
This patch has been tested by Chandra Seetharaman at IBM
(sekharan@us.ibm.com).

Testing methodology:
- Configure DS4K storage such that there is at least 1 active path and
  1 passive path accessible from the server.
- Start some I/O, such as a disk test, on any of the multipath devices.
- Bring the switch port connected to the active controller down, then
  back up after at least 1 minute.
- Failover _may_ fail because the mode select fails with a retryable
  error. (In RHEL 5.2 this won't happen.)

With this patch applied, failover completes properly.

===============================================================
Jesse Larrew
IBM Onsite Partner
978-392-3183
jlarrew@redhat.com

Proposed Patch:
===============
This patch is based on 2.6.18-136.el5.

diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c
index 7fa7f51..e33302e 100644
--- a/drivers/scsi/device_handler/scsi_dh_rdac.c
+++ b/drivers/scsi/device_handler/scsi_dh_rdac.c
@@ -27,6 +27,7 @@
 #include "../scsi_priv.h"
 
 #define RDAC_NAME "rdac"
+#define RDAC_RETRY_COUNT 3
 
 /*
  * LSI mode page stuff
@@ -480,21 +481,27 @@ static int send_mode_select(struct scsi_device *sdev, struct rdac_dh_data *h)
 {
 	struct request *rq;
 	struct request_queue *q = sdev->request_queue;
-	int err = SCSI_DH_RES_TEMP_UNAVAIL;
+	int err, retry_cnt = RDAC_RETRY_COUNT;
 
+retry:
+	err = SCSI_DH_RES_TEMP_UNAVAIL;
 	rq = rdac_failover_get(sdev, h);
 	if (!rq)
 		goto done;
 
-	sdev_printk(KERN_INFO, sdev, "queueing MODE_SELECT command.\n");
+	sdev_printk(KERN_INFO, sdev, "%s MODE_SELECT command.\n",
+		(retry_cnt == RDAC_RETRY_COUNT) ? "queueing" : "retrying");
 
 	err = blk_execute_rq(q, NULL, rq, 1);
-	if (err != SCSI_DH_OK)
+	blk_put_request(rq);
+	if (err != SCSI_DH_OK) {
 		err = mode_select_handle_sense(sdev, h->sense);
+		if (err == SCSI_DH_RETRY && retry_cnt--)
+			goto retry;
+	}
 	if (err == SCSI_DH_OK)
 		h->state = RDAC_STATE_ACTIVE;
 
-	blk_put_request(rq);
 done:
 	return err;
 }
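
Note on the retry bound: the control flow of the patched
send_mode_select() can be sketched in userspace to show how many
attempts the post-decrement test allows. This is only an illustrative
model, not kernel code. The names dh_result, fake_ctrl,
fake_mode_select() and send_with_retry() are hypothetical stand-ins
invented for this sketch, and the enum values are not the real
SCSI_DH_* values.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the SCSI_DH_* result codes (illustrative values). */
enum dh_result { DH_OK = 0, DH_RETRY, DH_IO_ERROR };

#define RETRY_COUNT 3 /* mirrors RDAC_RETRY_COUNT in the patch */

/* Hypothetical fake controller that returns a retryable error a fixed number of times. */
struct fake_ctrl {
	int retryable_failures_left; /* attempts that will still return DH_RETRY */
	int attempts;                /* total commands issued so far */
};

/* Simulates issuing one MODE SELECT and decoding its sense data. */
static enum dh_result fake_mode_select(struct fake_ctrl *c)
{
	c->attempts++;
	if (c->retryable_failures_left > 0) {
		c->retryable_failures_left--;
		return DH_RETRY;
	}
	return DH_OK;
}

/*
 * Same shape as the patched function: one initial attempt, then a jump
 * back to the retry label while the error is retryable and the budget
 * is not exhausted.
 */
static enum dh_result send_with_retry(struct fake_ctrl *c)
{
	enum dh_result err;
	int retry_cnt = RETRY_COUNT;

	assert(c != NULL);
retry:
	printf("%s MODE_SELECT command.\n",
	       (retry_cnt == RETRY_COUNT) ? "queueing" : "retrying");
	err = fake_mode_select(c);
	/* Post-decrement: the test sees the old value, so 3 retries are allowed. */
	if (err == DH_RETRY && retry_cnt--)
		goto retry;
	return err;
}
```

In this model, a controller that fails twice and then succeeds is
reached on the third attempt, while one that keeps returning a
retryable error is abandoned after four attempts (one initial attempt
plus RETRY_COUNT retries), and the retryable error is returned to the
caller.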