From: Peter Martuccelli <peterm@redhat.com> Subject: [RHEL 5.1][Patch 0/19] IPMI: Intro to patch set Date: Fri, 1 Jun 2007 15:04:32 -0400 Bugzilla: 241928 212415 231436 Message-Id: <200706011904.l51J4Vih014269@redrum.boston.redhat.com> Changelog: [ipmi] update to latest Hello, This is an introduction for the 19 IPMI patches being submitted for RHEL 5.1. I broke the patches apart in order to present them in a more understandable format. Patch #18 is a micellaneuos collection of fixes, which grew a little larger then I had anticipated. All of the patches are upstream as of 2.6.21, except for the last one which is in 2.6.22-rc3. This patch set resolves Bugzillas 241928, 212415, and 231436. Testing: Test kernels were built in brew and tested by various onsite partners. Dell verified the kernel changes, specifically the kipmid issue on their newer systems set to ship in the fall. IBM and HP also tested the kernel changes with no problems reported. Regards, Peter =================== diff -Naurp linux-2.6.18.noarch/Documentation/IPMI.txt linux-ipmi/Documentation/IPMI.txt --- linux-2.6.18.noarch/Documentation/IPMI.txt 2006-09-19 23:42:06.000000000 -0400 +++ linux-ipmi/Documentation/IPMI.txt 2007-06-11 15:23:49.000000000 -0400 @@ -326,9 +326,12 @@ for events, they will all receive all ev For receiving commands, you have to individually register commands you want to receive. Call ipmi_register_for_cmd() and supply the netfn -and command name for each command you want to receive. Only one user -may be registered for each netfn/cmd, but different users may register -for different commands. +and command name for each command you want to receive. You also +specify a bitmask of the channels you want to receive the command from +(or use IPMI_CHAN_ALL for all channels if you don't care). Only one +user may be registered for each netfn/cmd/channel, but different users +may register for different commands, or the same command if the +channel bitmasks do not overlap. From userland, equivalent IOCTLs are provided to do these functions. @@ -361,6 +364,8 @@ You can change this at module load time regspacings=<sp1>,<sp2>,... regsizes=<size1>,<size2>,... regshifts=<shift1>,<shift2>,... slave_addrs=<addr1>,<addr2>,... + force_kipmid=<enable1>,<enable2>,... + unload_when_empty=[0|1] Each of these except si_trydefaults is a list, the first item for the first interface, second item for the second interface, etc. @@ -406,6 +411,17 @@ The slave_addrs specifies the IPMI addre usually 0x20 and the driver defaults to that, but in case it's not, it can be specified when the driver starts up. +The force_ipmid parameter forcefully enables (if set to 1) or disables +(if set to 0) the kernel IPMI daemon. Normally this is auto-detected +by the driver, but systems with broken interrupts might need an enable, +or users that don't want the daemon (don't need the performance, don't +want the CPU hit) can disable it. + +If unload_when_empty is set to 1, the driver will be unloaded if it +doesn't find any interfaces or all the interfaces fail to work. The +default is one. Setting to 0 is useful with the hotmod, but is +obviously only useful for modules. + When compiled into the kernel, the addresses can be specified on the kernel command line as: @@ -416,6 +432,7 @@ kernel command line as: ipmi_si.regsizes=<size1>,<size2>,... ipmi_si.regshifts=<shift1>,<shift2>,... ipmi_si.slave_addrs=<addr1>,<addr2>,... + ipmi_si.force_kipmid=<enable1>,<enable2>,... It works the same as the module parameters of the same names. @@ -430,6 +447,25 @@ have high-res timers enabled in the kern interrupts enabled, the driver will run VERY slowly. Don't blame me, these interfaces suck. +The driver supports a hot add and remove of interfaces. This way, +interfaces can be added or removed after the kernel is up and running. +This is done using /sys/modules/ipmi_si/hotmod, which is a write-only +parameter. You write a string to this interface. The string has the +format: + <op1>[:op2[:op3...]] +The "op"s are: + add|remove,kcs|bt|smic,mem|i/o,<address>[,<opt1>[,<opt2>[,...]]] +You can specify more than one interface on the line. The "opt"s are: + rsp=<regspacing> + rsi=<regsize> + rsh=<regshift> + irq=<irq> + ipmb=<ipmb slave addr> +and these have the same meanings as discussed above. Note that you +can also use this on the kernel command line for a more compact format +for specifying an interface. Note that when removing an interface, +only the first three parameters (si type, address type, and address) +are used for the comparison. Any options are ignored for removing. The SMBus Driver ---------------- @@ -491,7 +527,10 @@ used to control it: modprobe ipmi_watchdog timeout=<t> pretimeout=<t> action=<action type> preaction=<preaction type> preop=<preop type> start_now=x - nowayout=x + nowayout=x ifnum_to_use=n + +ifnum_to_use specifies which interface the watchdog timer should use. +The default is -1, which means to pick the first one registered. The timeout is the number of seconds to the action, and the pretimeout is the amount of seconds before the reset that the pre-timeout panic will @@ -613,5 +652,9 @@ command line. The parameter is also ava in /proc/sys/dev/ipmi/poweroff_powercycle. Note that if the system does not support power cycling, it will always do the power off. +The "ifnum_to_use" parameter specifies which interface the poweroff +code should use. The default is -1, which means to pick the first one +registered. + Note that if you have ACPI enabled, the system will prefer using ACPI to power off. diff -Naurp linux-2.6.18.noarch/drivers/char/ipmi/ipmi_bt_sm.c linux-ipmi/drivers/char/ipmi/ipmi_bt_sm.c --- linux-2.6.18.noarch/drivers/char/ipmi/ipmi_bt_sm.c 2006-09-19 23:42:06.000000000 -0400 +++ linux-ipmi/drivers/char/ipmi/ipmi_bt_sm.c 2007-06-11 15:23:49.000000000 -0400 @@ -33,11 +33,15 @@ #include <linux/ipmi_msgdefs.h> /* for completion codes */ #include "ipmi_si_sm.h" -static int bt_debug = 0x00; /* Production value 0, see following flags */ +#define BT_DEBUG_OFF 0 /* Used in production */ +#define BT_DEBUG_ENABLE 1 /* Generic messages */ +#define BT_DEBUG_MSG 2 /* Prints all request/response buffers */ +#define BT_DEBUG_STATES 4 /* Verbose look at state changes */ +/* BT_DEBUG_OFF must be zero to correspond to the default uninitialized + value */ + +static int bt_debug; /* 0 == BT_DEBUG_OFF */ -#define BT_DEBUG_ENABLE 1 -#define BT_DEBUG_MSG 2 -#define BT_DEBUG_STATES 4 module_param(bt_debug, int, 0644); MODULE_PARM_DESC(bt_debug, "debug bitmask, 1=enable, 2=messages, 4=states"); @@ -47,38 +51,54 @@ MODULE_PARM_DESC(bt_debug, "debug bitmas Since the Open IPMI architecture is single-message oriented at this stage, the queue depth of BT is of no concern. */ -#define BT_NORMAL_TIMEOUT 5000000 /* seconds in microseconds */ -#define BT_RETRY_LIMIT 2 -#define BT_RESET_DELAY 6000000 /* 6 seconds after warm reset */ +#define BT_NORMAL_TIMEOUT 5 /* seconds */ +#define BT_NORMAL_RETRY_LIMIT 2 +#define BT_RESET_DELAY 6 /* seconds after warm reset */ + +/* States are written in chronological order and usually cover + multiple rows of the state table discussion in the IPMI spec. */ enum bt_states { - BT_STATE_IDLE, + BT_STATE_IDLE = 0, /* Order is critical in this list */ BT_STATE_XACTION_START, BT_STATE_WRITE_BYTES, - BT_STATE_WRITE_END, BT_STATE_WRITE_CONSUME, - BT_STATE_B2H_WAIT, - BT_STATE_READ_END, - BT_STATE_RESET1, /* These must come last */ + BT_STATE_READ_WAIT, + BT_STATE_CLEAR_B2H, + BT_STATE_READ_BYTES, + BT_STATE_RESET1, /* These must come last */ BT_STATE_RESET2, BT_STATE_RESET3, BT_STATE_RESTART, - BT_STATE_HOSED + BT_STATE_PRINTME, + BT_STATE_CAPABILITIES_BEGIN, + BT_STATE_CAPABILITIES_END, + BT_STATE_LONG_BUSY /* BT doesn't get hosed :-) */ }; +/* Macros seen at the end of state "case" blocks. They help with legibility + and debugging. */ + +#define BT_STATE_CHANGE(X,Y) { bt->state = X; return Y; } + +#define BT_SI_SM_RETURN(Y) { last_printed = BT_STATE_PRINTME; return Y; } + struct si_sm_data { enum bt_states state; - enum bt_states last_state; /* assist printing and resets */ unsigned char seq; /* BT sequence number */ struct si_sm_io *io; - unsigned char write_data[IPMI_MAX_MSG_LENGTH]; - int write_count; - unsigned char read_data[IPMI_MAX_MSG_LENGTH]; - int read_count; - int truncated; - long timeout; - unsigned int error_retries; /* end of "common" fields */ + unsigned char write_data[IPMI_MAX_MSG_LENGTH]; + int write_count; + unsigned char read_data[IPMI_MAX_MSG_LENGTH]; + int read_count; + int truncated; + long timeout; /* microseconds countdown */ + int error_retries; /* end of "common" fields */ int nonzero_status; /* hung BMCs stay all 0 */ + enum bt_states complete; /* to divert the state machine */ + int BT_CAP_outreqs; + long BT_CAP_req2rsp; + int BT_CAP_retries; /* Recommended retries */ }; #define BT_CLR_WR_PTR 0x01 /* See IPMI 1.5 table 11.6.4 */ @@ -111,86 +131,118 @@ struct si_sm_data { static char *state2txt(unsigned char state) { switch (state) { - case BT_STATE_IDLE: return("IDLE"); - case BT_STATE_XACTION_START: return("XACTION"); - case BT_STATE_WRITE_BYTES: return("WR_BYTES"); - case BT_STATE_WRITE_END: return("WR_END"); - case BT_STATE_WRITE_CONSUME: return("WR_CONSUME"); - case BT_STATE_B2H_WAIT: return("B2H_WAIT"); - case BT_STATE_READ_END: return("RD_END"); - case BT_STATE_RESET1: return("RESET1"); - case BT_STATE_RESET2: return("RESET2"); - case BT_STATE_RESET3: return("RESET3"); - case BT_STATE_RESTART: return("RESTART"); - case BT_STATE_HOSED: return("HOSED"); + case BT_STATE_IDLE: return("IDLE"); + case BT_STATE_XACTION_START: return("XACTION"); + case BT_STATE_WRITE_BYTES: return("WR_BYTES"); + case BT_STATE_WRITE_CONSUME: return("WR_CONSUME"); + case BT_STATE_READ_WAIT: return("RD_WAIT"); + case BT_STATE_CLEAR_B2H: return("CLEAR_B2H"); + case BT_STATE_READ_BYTES: return("RD_BYTES"); + case BT_STATE_RESET1: return("RESET1"); + case BT_STATE_RESET2: return("RESET2"); + case BT_STATE_RESET3: return("RESET3"); + case BT_STATE_RESTART: return("RESTART"); + case BT_STATE_LONG_BUSY: return("LONG_BUSY"); + case BT_STATE_CAPABILITIES_BEGIN: return("CAP_BEGIN"); + case BT_STATE_CAPABILITIES_END: return("CAP_END"); } return("BAD STATE"); } #define STATE2TXT state2txt(bt->state) -static char *status2txt(unsigned char status, char *buf) +static char *status2txt(unsigned char status) { + /* + * This cannot be called by two threads at the same time and + * the buffer is always consumed immediately, so the static is + * safe to use. + */ + static char buf[40]; + strcpy(buf, "[ "); - if (status & BT_B_BUSY) strcat(buf, "B_BUSY "); - if (status & BT_H_BUSY) strcat(buf, "H_BUSY "); - if (status & BT_OEM0) strcat(buf, "OEM0 "); - if (status & BT_SMS_ATN) strcat(buf, "SMS "); - if (status & BT_B2H_ATN) strcat(buf, "B2H "); - if (status & BT_H2B_ATN) strcat(buf, "H2B "); + if (status & BT_B_BUSY) + strcat(buf, "B_BUSY "); + if (status & BT_H_BUSY) + strcat(buf, "H_BUSY "); + if (status & BT_OEM0) + strcat(buf, "OEM0 "); + if (status & BT_SMS_ATN) + strcat(buf, "SMS "); + if (status & BT_B2H_ATN) + strcat(buf, "B2H "); + if (status & BT_H2B_ATN) + strcat(buf, "H2B "); strcat(buf, "]"); return buf; } -#define STATUS2TXT(buf) status2txt(status, buf) +#define STATUS2TXT status2txt(status) + +/* called externally at insmod time, and internally on cleanup */ -/* This will be called from within this module on a hosed condition */ -#define FIRST_SEQ 0 static unsigned int bt_init_data(struct si_sm_data *bt, struct si_sm_io *io) { - bt->state = BT_STATE_IDLE; - bt->last_state = BT_STATE_IDLE; - bt->seq = FIRST_SEQ; - bt->io = io; - bt->write_count = 0; - bt->read_count = 0; - bt->error_retries = 0; - bt->nonzero_status = 0; - bt->truncated = 0; - bt->timeout = BT_NORMAL_TIMEOUT; + memset(bt, 0, sizeof(struct si_sm_data)); + if (bt->io != io) { /* external: one-time only things */ + bt->io = io; + bt->seq = 0; + } + bt->state = BT_STATE_IDLE; /* start here */ + bt->complete = BT_STATE_IDLE; /* end here */ + bt->BT_CAP_req2rsp = BT_NORMAL_TIMEOUT * 1000000; + bt->BT_CAP_retries = BT_NORMAL_RETRY_LIMIT; + /* BT_CAP_outreqs == zero is a flag to read BT Capabilities */ return 3; /* We claim 3 bytes of space; ought to check SPMI table */ } +/* Jam a completion code (probably an error) into a response */ + +static void force_result(struct si_sm_data *bt, unsigned char completion_code) +{ + bt->read_data[0] = 4; /* # following bytes */ + bt->read_data[1] = bt->write_data[1] | 4; /* Odd NetFn/LUN */ + bt->read_data[2] = bt->write_data[2]; /* seq (ignored) */ + bt->read_data[3] = bt->write_data[3]; /* Command */ + bt->read_data[4] = completion_code; + bt->read_count = 5; +} + +/* The upper state machine starts here */ + static int bt_start_transaction(struct si_sm_data *bt, unsigned char *data, unsigned int size) { unsigned int i; - if ((size < 2) || (size > (IPMI_MAX_MSG_LENGTH - 2))) - return -1; + if (size < 2) + return IPMI_REQ_LEN_INVALID_ERR; + if (size > IPMI_MAX_MSG_LENGTH) + return IPMI_REQ_LEN_EXCEEDED_ERR; + + if (bt->state == BT_STATE_LONG_BUSY) + return IPMI_NODE_BUSY_ERR; - if ((bt->state != BT_STATE_IDLE) && (bt->state != BT_STATE_HOSED)) - return -2; + if (bt->state != BT_STATE_IDLE) + return IPMI_NOT_IN_MY_STATE_ERR; if (bt_debug & BT_DEBUG_MSG) { - printk(KERN_WARNING "+++++++++++++++++++++++++++++++++++++\n"); - printk(KERN_WARNING "BT: write seq=0x%02X:", bt->seq); + printk(KERN_WARNING "BT: +++++++++++++++++ New command\n"); + printk(KERN_WARNING "BT: NetFn/LUN CMD [%d data]:", size - 2); for (i = 0; i < size; i ++) - printk (" %02x", data[i]); + printk (" %02x", data[i]); printk("\n"); } bt->write_data[0] = size + 1; /* all data plus seq byte */ bt->write_data[1] = *data; /* NetFn/LUN */ - bt->write_data[2] = bt->seq; + bt->write_data[2] = bt->seq++; memcpy(bt->write_data + 3, data + 1, size - 1); bt->write_count = size + 2; - bt->error_retries = 0; bt->nonzero_status = 0; - bt->read_count = 0; bt->truncated = 0; bt->state = BT_STATE_XACTION_START; - bt->last_state = BT_STATE_IDLE; - bt->timeout = BT_NORMAL_TIMEOUT; + bt->timeout = bt->BT_CAP_req2rsp; + force_result(bt, IPMI_ERR_UNSPECIFIED); return 0; } @@ -198,38 +250,30 @@ static int bt_start_transaction(struct s it calls this. Strip out the length and seq bytes. */ static int bt_get_result(struct si_sm_data *bt, - unsigned char *data, - unsigned int length) + unsigned char *data, + unsigned int length) { int i, msg_len; msg_len = bt->read_count - 2; /* account for length & seq */ - /* Always NetFn, Cmd, cCode */ if (msg_len < 3 || msg_len > IPMI_MAX_MSG_LENGTH) { - printk(KERN_DEBUG "BT results: bad msg_len = %d\n", msg_len); - data[0] = bt->write_data[1] | 0x4; /* Kludge a response */ - data[1] = bt->write_data[3]; - data[2] = IPMI_ERR_UNSPECIFIED; + force_result(bt, IPMI_ERR_UNSPECIFIED); msg_len = 3; - } else { - data[0] = bt->read_data[1]; - data[1] = bt->read_data[3]; - if (length < msg_len) - bt->truncated = 1; - if (bt->truncated) { /* can be set in read_all_bytes() */ - data[2] = IPMI_ERR_MSG_TRUNCATED; - msg_len = 3; - } else - memcpy(data + 2, bt->read_data + 4, msg_len - 2); + } + data[0] = bt->read_data[1]; + data[1] = bt->read_data[3]; + if (length < msg_len || bt->truncated) { + data[2] = IPMI_ERR_MSG_TRUNCATED; + msg_len = 3; + } else + memcpy(data + 2, bt->read_data + 4, msg_len - 2); - if (bt_debug & BT_DEBUG_MSG) { - printk (KERN_WARNING "BT: res (raw)"); - for (i = 0; i < msg_len; i++) - printk(" %02x", data[i]); - printk ("\n"); - } + if (bt_debug & BT_DEBUG_MSG) { + printk (KERN_WARNING "BT: result %d bytes:", msg_len); + for (i = 0; i < msg_len; i++) + printk(" %02x", data[i]); + printk ("\n"); } - bt->read_count = 0; /* paranoia */ return msg_len; } @@ -238,22 +282,40 @@ static int bt_get_result(struct si_sm_da static void reset_flags(struct si_sm_data *bt) { + if (bt_debug) + printk(KERN_WARNING "IPMI BT: flag reset %s\n", + status2txt(BT_STATUS)); if (BT_STATUS & BT_H_BUSY) - BT_CONTROL(BT_H_BUSY); - if (BT_STATUS & BT_B_BUSY) - BT_CONTROL(BT_B_BUSY); - BT_CONTROL(BT_CLR_WR_PTR); - BT_CONTROL(BT_SMS_ATN); - - if (BT_STATUS & BT_B2H_ATN) { - int i; - BT_CONTROL(BT_H_BUSY); - BT_CONTROL(BT_B2H_ATN); - BT_CONTROL(BT_CLR_RD_PTR); - for (i = 0; i < IPMI_MAX_MSG_LENGTH + 2; i++) - BMC2HOST; - BT_CONTROL(BT_H_BUSY); - } + BT_CONTROL(BT_H_BUSY); /* force clear */ + BT_CONTROL(BT_CLR_WR_PTR); /* always reset */ + BT_CONTROL(BT_SMS_ATN); /* always clear */ + BT_INTMASK_W(BT_BMC_HWRST); +} + +/* Get rid of an unwanted/stale response. This should only be needed for + BMCs that support multiple outstanding requests. */ + +static void drain_BMC2HOST(struct si_sm_data *bt) +{ + int i, size; + + if (!(BT_STATUS & BT_B2H_ATN)) /* Not signalling a response */ + return; + + BT_CONTROL(BT_H_BUSY); /* now set */ + BT_CONTROL(BT_B2H_ATN); /* always clear */ + BT_STATUS; /* pause */ + BT_CONTROL(BT_B2H_ATN); /* some BMCs are stubborn */ + BT_CONTROL(BT_CLR_RD_PTR); /* always reset */ + if (bt_debug) + printk(KERN_WARNING "IPMI BT: stale response %s; ", + status2txt(BT_STATUS)); + size = BMC2HOST; + for (i = 0; i < size ; i++) + BMC2HOST; + BT_CONTROL(BT_H_BUSY); /* now clear */ + if (bt_debug) + printk("drained %d bytes\n", size + 1); } static inline void write_all_bytes(struct si_sm_data *bt) @@ -261,201 +323,256 @@ static inline void write_all_bytes(struc int i; if (bt_debug & BT_DEBUG_MSG) { - printk(KERN_WARNING "BT: write %d bytes seq=0x%02X", + printk(KERN_WARNING "BT: write %d bytes seq=0x%02X", bt->write_count, bt->seq); for (i = 0; i < bt->write_count; i++) printk (" %02x", bt->write_data[i]); printk ("\n"); } for (i = 0; i < bt->write_count; i++) - HOST2BMC(bt->write_data[i]); + HOST2BMC(bt->write_data[i]); } static inline int read_all_bytes(struct si_sm_data *bt) { unsigned char i; + /* length is "framing info", minimum = 4: NetFn, Seq, Cmd, cCode. + Keep layout of first four bytes aligned with write_data[] */ + bt->read_data[0] = BMC2HOST; bt->read_count = bt->read_data[0]; - if (bt_debug & BT_DEBUG_MSG) - printk(KERN_WARNING "BT: read %d bytes:", bt->read_count); - /* minimum: length, NetFn, Seq, Cmd, cCode == 5 total, or 4 more - following the length byte. */ if (bt->read_count < 4 || bt->read_count >= IPMI_MAX_MSG_LENGTH) { if (bt_debug & BT_DEBUG_MSG) - printk("bad length %d\n", bt->read_count); + printk(KERN_WARNING "BT: bad raw rsp len=%d\n", + bt->read_count); bt->truncated = 1; return 1; /* let next XACTION START clean it up */ } for (i = 1; i <= bt->read_count; i++) - bt->read_data[i] = BMC2HOST; - bt->read_count++; /* account for the length byte */ + bt->read_data[i] = BMC2HOST; + bt->read_count++; /* Account internally for length byte */ if (bt_debug & BT_DEBUG_MSG) { - for (i = 0; i < bt->read_count; i++) + int max = bt->read_count; + + printk(KERN_WARNING "BT: got %d bytes seq=0x%02X", + max, bt->read_data[2]); + if (max > 16) + max = 16; + for (i = 0; i < max; i++) printk (" %02x", bt->read_data[i]); - printk ("\n"); + printk ("%s\n", bt->read_count == max ? "" : " ..."); } - if (bt->seq != bt->write_data[2]) /* idiot check */ - printk(KERN_DEBUG "BT: internal error: sequence mismatch\n"); - /* per the spec, the (NetFn, Seq, Cmd) tuples should match */ - if ((bt->read_data[3] == bt->write_data[3]) && /* Cmd */ - (bt->read_data[2] == bt->write_data[2]) && /* Sequence */ - ((bt->read_data[1] & 0xF8) == (bt->write_data[1] & 0xF8))) + /* per the spec, the (NetFn[1], Seq[2], Cmd[3]) tuples must match */ + if ((bt->read_data[3] == bt->write_data[3]) && + (bt->read_data[2] == bt->write_data[2]) && + ((bt->read_data[1] & 0xF8) == (bt->write_data[1] & 0xF8))) return 1; if (bt_debug & BT_DEBUG_MSG) - printk(KERN_WARNING "BT: bad packet: " + printk(KERN_WARNING "IPMI BT: bad packet: " "want 0x(%02X, %02X, %02X) got (%02X, %02X, %02X)\n", - bt->write_data[1], bt->write_data[2], bt->write_data[3], + bt->write_data[1] | 0x04, bt->write_data[2], bt->write_data[3], bt->read_data[1], bt->read_data[2], bt->read_data[3]); return 0; } -/* Modifies bt->state appropriately, need to get into the bt_event() switch */ +/* Restart if retries are left, or return an error completion code */ -static void error_recovery(struct si_sm_data *bt, char *reason) +static enum si_sm_result error_recovery(struct si_sm_data *bt, + unsigned char status, + unsigned char cCode) { - unsigned char status; - char buf[40]; /* For getting status */ + char *reason; - bt->timeout = BT_NORMAL_TIMEOUT; /* various places want to retry */ + bt->timeout = bt->BT_CAP_req2rsp; - status = BT_STATUS; - printk(KERN_DEBUG "BT: %s in %s %s\n", reason, STATE2TXT, - STATUS2TXT(buf)); + switch (cCode) { + case IPMI_TIMEOUT_ERR: + reason = "timeout"; + break; + default: + reason = "internal error"; + break; + } + + printk(KERN_WARNING "IPMI BT: %s in %s %s ", /* open-ended line */ + reason, STATE2TXT, STATUS2TXT); + /* Per the IPMI spec, retries are based on the sequence number + known only to this module, so manage a restart here. */ (bt->error_retries)++; - if (bt->error_retries > BT_RETRY_LIMIT) { - printk(KERN_DEBUG "retry limit (%d) exceeded\n", BT_RETRY_LIMIT); - bt->state = BT_STATE_HOSED; - if (!bt->nonzero_status) - printk(KERN_ERR "IPMI: BT stuck, try power cycle\n"); - else if (bt->error_retries <= BT_RETRY_LIMIT + 1) { - printk(KERN_DEBUG "IPMI: BT reset (takes 5 secs)\n"); - bt->state = BT_STATE_RESET1; - } - return; + if (bt->error_retries < bt->BT_CAP_retries) { + printk("%d retries left\n", + bt->BT_CAP_retries - bt->error_retries); + bt->state = BT_STATE_RESTART; + return SI_SM_CALL_WITHOUT_DELAY; } - /* Sometimes the BMC queues get in an "off-by-one" state...*/ - if ((bt->state == BT_STATE_B2H_WAIT) && (status & BT_B2H_ATN)) { - printk(KERN_DEBUG "retry B2H_WAIT\n"); - return; + printk("failed %d retries, sending error response\n", + bt->BT_CAP_retries); + if (!bt->nonzero_status) + printk(KERN_ERR "IPMI BT: stuck, try power cycle\n"); + + /* this is most likely during insmod */ + else if (bt->seq <= (unsigned char)(bt->BT_CAP_retries & 0xFF)) { + printk(KERN_WARNING "IPMI: BT reset (takes 5 secs)\n"); + bt->state = BT_STATE_RESET1; + return SI_SM_CALL_WITHOUT_DELAY; } - printk(KERN_DEBUG "restart command\n"); - bt->state = BT_STATE_RESTART; + /* Concoct a useful error message, set up the next state, and + be done with this sequence. */ + + bt->state = BT_STATE_IDLE; + switch (cCode) { + case IPMI_TIMEOUT_ERR: + if (status & BT_B_BUSY) { + cCode = IPMI_NODE_BUSY_ERR; + bt->state = BT_STATE_LONG_BUSY; + } + break; + default: + break; + } + force_result(bt, cCode); + return SI_SM_TRANSACTION_COMPLETE; } -/* Check the status and (possibly) advance the BT state machine. The - default return is SI_SM_CALL_WITH_DELAY. */ +/* Check status and (usually) take action and change this state machine. */ static enum si_sm_result bt_event(struct si_sm_data *bt, long time) { - unsigned char status; - char buf[40]; /* For getting status */ + unsigned char status, BT_CAP[8]; + static enum bt_states last_printed = BT_STATE_PRINTME; int i; status = BT_STATUS; bt->nonzero_status |= status; - - if ((bt_debug & BT_DEBUG_STATES) && (bt->state != bt->last_state)) + if ((bt_debug & BT_DEBUG_STATES) && (bt->state != last_printed)) { printk(KERN_WARNING "BT: %s %s TO=%ld - %ld \n", STATE2TXT, - STATUS2TXT(buf), + STATUS2TXT, bt->timeout, time); - bt->last_state = bt->state; + last_printed = bt->state; + } - if (bt->state == BT_STATE_HOSED) - return SI_SM_HOSED; + /* Commands that time out may still (eventually) provide a response. + This stale response will get in the way of a new response so remove + it if possible (hopefully during IDLE). Even if it comes up later + it will be rejected by its (now-forgotten) seq number. */ + + if ((bt->state < BT_STATE_WRITE_BYTES) && (status & BT_B2H_ATN)) { + drain_BMC2HOST(bt); + BT_SI_SM_RETURN(SI_SM_CALL_WITH_DELAY); + } - if (bt->state != BT_STATE_IDLE) { /* do timeout test */ + if ((bt->state != BT_STATE_IDLE) && + (bt->state < BT_STATE_PRINTME)) { /* check timeout */ bt->timeout -= time; - if ((bt->timeout < 0) && (bt->state < BT_STATE_RESET1)) { - error_recovery(bt, "timed out"); - return SI_SM_CALL_WITHOUT_DELAY; - } + if ((bt->timeout < 0) && (bt->state < BT_STATE_RESET1)) + return error_recovery(bt, + status, + IPMI_TIMEOUT_ERR); } switch (bt->state) { - case BT_STATE_IDLE: /* check for asynchronous messages */ + /* Idle state first checks for asynchronous messages from another + channel, then does some opportunistic housekeeping. */ + + case BT_STATE_IDLE: if (status & BT_SMS_ATN) { BT_CONTROL(BT_SMS_ATN); /* clear it */ return SI_SM_ATTN; } - return SI_SM_IDLE; - case BT_STATE_XACTION_START: - if (status & BT_H_BUSY) { + if (status & BT_H_BUSY) /* clear a leftover H_BUSY */ BT_CONTROL(BT_H_BUSY); - break; - } - if (status & BT_B2H_ATN) - break; - bt->state = BT_STATE_WRITE_BYTES; - return SI_SM_CALL_WITHOUT_DELAY; /* for logging */ - case BT_STATE_WRITE_BYTES: + /* Read BT capabilities if it hasn't been done yet */ + if (!bt->BT_CAP_outreqs) + BT_STATE_CHANGE(BT_STATE_CAPABILITIES_BEGIN, + SI_SM_CALL_WITHOUT_DELAY); + bt->timeout = bt->BT_CAP_req2rsp; + BT_SI_SM_RETURN(SI_SM_IDLE); + + case BT_STATE_XACTION_START: if (status & (BT_B_BUSY | BT_H2B_ATN)) - break; + BT_SI_SM_RETURN(SI_SM_CALL_WITH_DELAY); + if (BT_STATUS & BT_H_BUSY) + BT_CONTROL(BT_H_BUSY); /* force clear */ + BT_STATE_CHANGE(BT_STATE_WRITE_BYTES, + SI_SM_CALL_WITHOUT_DELAY); + + case BT_STATE_WRITE_BYTES: + if (status & BT_H_BUSY) + BT_CONTROL(BT_H_BUSY); /* clear */ BT_CONTROL(BT_CLR_WR_PTR); write_all_bytes(bt); - BT_CONTROL(BT_H2B_ATN); /* clears too fast to catch? */ - bt->state = BT_STATE_WRITE_CONSUME; - return SI_SM_CALL_WITHOUT_DELAY; /* it MIGHT sail through */ - - case BT_STATE_WRITE_CONSUME: /* BMCs usually blow right thru here */ - if (status & (BT_H2B_ATN | BT_B_BUSY)) - break; - bt->state = BT_STATE_B2H_WAIT; - /* fall through with status */ - - /* Stay in BT_STATE_B2H_WAIT until a packet matches. However, spinning - hard here, constantly reading status, seems to hold off the - generation of B2H_ATN so ALWAYS return CALL_WITH_DELAY. */ - - case BT_STATE_B2H_WAIT: - if (!(status & BT_B2H_ATN)) - break; - - /* Assume ordered, uncached writes: no need to wait */ - if (!(status & BT_H_BUSY)) - BT_CONTROL(BT_H_BUSY); /* set */ - BT_CONTROL(BT_B2H_ATN); /* clear it, ACK to the BMC */ - BT_CONTROL(BT_CLR_RD_PTR); /* reset the queue */ - i = read_all_bytes(bt); - BT_CONTROL(BT_H_BUSY); /* clear */ - if (!i) /* Try this state again */ - break; - bt->state = BT_STATE_READ_END; - return SI_SM_CALL_WITHOUT_DELAY; /* for logging */ - - case BT_STATE_READ_END: - - /* I could wait on BT_H_BUSY to go clear for a truly clean - exit. However, this is already done in XACTION_START - and the (possible) extra loop/status/possible wait affects - performance. So, as long as it works, just ignore H_BUSY */ + BT_CONTROL(BT_H2B_ATN); /* can clear too fast to catch */ + BT_STATE_CHANGE(BT_STATE_WRITE_CONSUME, + SI_SM_CALL_WITHOUT_DELAY); -#ifdef MAKE_THIS_TRUE_IF_NECESSARY + case BT_STATE_WRITE_CONSUME: + if (status & (BT_B_BUSY | BT_H2B_ATN)) + BT_SI_SM_RETURN(SI_SM_CALL_WITH_DELAY); + BT_STATE_CHANGE(BT_STATE_READ_WAIT, + SI_SM_CALL_WITHOUT_DELAY); + + /* Spinning hard can suppress B2H_ATN and force a timeout */ + + case BT_STATE_READ_WAIT: + if (!(status & BT_B2H_ATN)) + BT_SI_SM_RETURN(SI_SM_CALL_WITH_DELAY); + BT_CONTROL(BT_H_BUSY); /* set */ + + /* Uncached, ordered writes should just proceeed serially but + some BMCs don't clear B2H_ATN with one hit. Fast-path a + workaround without too much penalty to the general case. */ + + BT_CONTROL(BT_B2H_ATN); /* clear it to ACK the BMC */ + BT_STATE_CHANGE(BT_STATE_CLEAR_B2H, + SI_SM_CALL_WITHOUT_DELAY); + + case BT_STATE_CLEAR_B2H: + if (status & BT_B2H_ATN) { /* keep hitting it */ + BT_CONTROL(BT_B2H_ATN); + BT_SI_SM_RETURN(SI_SM_CALL_WITH_DELAY); + } + BT_STATE_CHANGE(BT_STATE_READ_BYTES, + SI_SM_CALL_WITHOUT_DELAY); - if (status & BT_H_BUSY) - break; -#endif - bt->seq++; - bt->state = BT_STATE_IDLE; - return SI_SM_TRANSACTION_COMPLETE; + case BT_STATE_READ_BYTES: + if (!(status & BT_H_BUSY)) /* check in case of retry */ + BT_CONTROL(BT_H_BUSY); + BT_CONTROL(BT_CLR_RD_PTR); /* start of BMC2HOST buffer */ + i = read_all_bytes(bt); /* true == packet seq match */ + BT_CONTROL(BT_H_BUSY); /* NOW clear */ + if (!i) /* Not my message */ + BT_STATE_CHANGE(BT_STATE_READ_WAIT, + SI_SM_CALL_WITHOUT_DELAY); + bt->state = bt->complete; + return bt->state == BT_STATE_IDLE ? /* where to next? */ + SI_SM_TRANSACTION_COMPLETE : /* normal */ + SI_SM_CALL_WITHOUT_DELAY; /* Startup magic */ + + case BT_STATE_LONG_BUSY: /* For example: after FW update */ + if (!(status & BT_B_BUSY)) { + reset_flags(bt); /* next state is now IDLE */ + bt_init_data(bt, bt->io); + } + return SI_SM_CALL_WITH_DELAY; /* No repeat printing */ case BT_STATE_RESET1: - reset_flags(bt); - bt->timeout = BT_RESET_DELAY; - bt->state = BT_STATE_RESET2; - break; + reset_flags(bt); + drain_BMC2HOST(bt); + BT_STATE_CHANGE(BT_STATE_RESET2, + SI_SM_CALL_WITH_DELAY); case BT_STATE_RESET2: /* Send a soft reset */ BT_CONTROL(BT_CLR_WR_PTR); @@ -464,29 +581,59 @@ static enum si_sm_result bt_event(struct HOST2BMC(42); /* Sequence number */ HOST2BMC(3); /* Cmd == Soft reset */ BT_CONTROL(BT_H2B_ATN); - bt->state = BT_STATE_RESET3; - break; + bt->timeout = BT_RESET_DELAY * 1000000; + BT_STATE_CHANGE(BT_STATE_RESET3, + SI_SM_CALL_WITH_DELAY); - case BT_STATE_RESET3: + case BT_STATE_RESET3: /* Hold off everything for a bit */ if (bt->timeout > 0) - return SI_SM_CALL_WITH_DELAY; - bt->state = BT_STATE_RESTART; /* printk in debug modes */ - break; + return SI_SM_CALL_WITH_DELAY; + drain_BMC2HOST(bt); + BT_STATE_CHANGE(BT_STATE_RESTART, + SI_SM_CALL_WITH_DELAY); - case BT_STATE_RESTART: /* don't reset retries! */ - reset_flags(bt); - bt->write_data[2] = ++bt->seq; + case BT_STATE_RESTART: /* don't reset retries or seq! */ bt->read_count = 0; bt->nonzero_status = 0; - bt->timeout = BT_NORMAL_TIMEOUT; - bt->state = BT_STATE_XACTION_START; - break; - - default: /* HOSED is supposed to be caught much earlier */ - error_recovery(bt, "internal logic error"); - break; - } - return SI_SM_CALL_WITH_DELAY; + bt->timeout = bt->BT_CAP_req2rsp; + BT_STATE_CHANGE(BT_STATE_XACTION_START, + SI_SM_CALL_WITH_DELAY); + + /* Get BT Capabilities, using timing of upper level state machine. + Set outreqs to prevent infinite loop on timeout. */ + case BT_STATE_CAPABILITIES_BEGIN: + bt->BT_CAP_outreqs = 1; + { + unsigned char GetBT_CAP[] = { 0x18, 0x36 }; + bt->state = BT_STATE_IDLE; + bt_start_transaction(bt, GetBT_CAP, sizeof(GetBT_CAP)); + } + bt->complete = BT_STATE_CAPABILITIES_END; + BT_STATE_CHANGE(BT_STATE_XACTION_START, + SI_SM_CALL_WITH_DELAY); + + case BT_STATE_CAPABILITIES_END: + i = bt_get_result(bt, BT_CAP, sizeof(BT_CAP)); + bt_init_data(bt, bt->io); + if ((i == 8) && !BT_CAP[2]) { + bt->BT_CAP_outreqs = BT_CAP[3]; + bt->BT_CAP_req2rsp = BT_CAP[6] * 1000000; + bt->BT_CAP_retries = BT_CAP[7]; + } else + printk(KERN_WARNING "IPMI BT: using default values\n"); + if (!bt->BT_CAP_outreqs) + bt->BT_CAP_outreqs = 1; + printk(KERN_WARNING "IPMI BT: req2rsp=%ld secs retries=%d\n", + bt->BT_CAP_req2rsp / 1000000L, bt->BT_CAP_retries); + bt->timeout = bt->BT_CAP_req2rsp; + return SI_SM_CALL_WITHOUT_DELAY; + + default: /* should never occur */ + return error_recovery(bt, + status, + IPMI_ERR_UNSPECIFIED); + } + return SI_SM_CALL_WITH_DELAY; } static int bt_detect(struct si_sm_data *bt) @@ -497,7 +644,7 @@ static int bt_detect(struct si_sm_data * test that first. The calling routine uses negative logic. */ if ((BT_STATUS == 0xFF) && (BT_INTMASK_R == 0xFF)) - return 1; + return 1; reset_flags(bt); return 0; } @@ -513,11 +660,11 @@ static int bt_size(void) struct si_sm_handlers bt_smi_handlers = { - .init_data = bt_init_data, - .start_transaction = bt_start_transaction, - .get_result = bt_get_result, - .event = bt_event, - .detect = bt_detect, - .cleanup = bt_cleanup, - .size = bt_size, + .init_data = bt_init_data, + .start_transaction = bt_start_transaction, + .get_result = bt_get_result, + .event = bt_event, + .detect = bt_detect, + .cleanup = bt_cleanup, + .size = bt_size, }; diff -Naurp linux-2.6.18.noarch/drivers/char/ipmi/ipmi_devintf.c linux-ipmi/drivers/char/ipmi/ipmi_devintf.c --- linux-2.6.18.noarch/drivers/char/ipmi/ipmi_devintf.c 2006-09-19 23:42:06.000000000 -0400 +++ linux-ipmi/drivers/char/ipmi/ipmi_devintf.c 2007-06-11 15:23:49.000000000 -0400 @@ -377,7 +377,8 @@ static int ipmi_ioctl(struct inode *ino break; } - rv = ipmi_register_for_cmd(priv->user, val.netfn, val.cmd); + rv = ipmi_register_for_cmd(priv->user, val.netfn, val.cmd, + IPMI_CHAN_ALL); break; } @@ -390,7 +391,36 @@ static int ipmi_ioctl(struct inode *ino break; } - rv = ipmi_unregister_for_cmd(priv->user, val.netfn, val.cmd); + rv = ipmi_unregister_for_cmd(priv->user, val.netfn, val.cmd, + IPMI_CHAN_ALL); + break; + } + + case IPMICTL_REGISTER_FOR_CMD_CHANS: + { + struct ipmi_cmdspec_chans val; + + if (copy_from_user(&val, arg, sizeof(val))) { + rv = -EFAULT; + break; + } + + rv = ipmi_register_for_cmd(priv->user, val.netfn, val.cmd, + val.chans); + break; + } + + case IPMICTL_UNREGISTER_FOR_CMD_CHANS: + { + struct ipmi_cmdspec_chans val; + + if (copy_from_user(&val, arg, sizeof(val))) { + rv = -EFAULT; + break; + } + + rv = ipmi_unregister_for_cmd(priv->user, val.netfn, val.cmd, + val.chans); break; } @@ -566,6 +596,31 @@ static int ipmi_ioctl(struct inode *ino rv = 0; break; } + + case IPMICTL_GET_MAINTENANCE_MODE_CMD: + { + int mode; + + mode = ipmi_get_maintenance_mode(priv->user); + if (copy_to_user(arg, &mode, sizeof(mode))) { + rv = -EFAULT; + break; + } + rv = 0; + break; + } + + case IPMICTL_SET_MAINTENANCE_MODE_CMD: + { + int mode; + + if (copy_from_user(&mode, arg, sizeof(mode))) { + rv = -EFAULT; + break; + } + rv = ipmi_set_maintenance_mode(priv->user, mode); + break; + } } return rv; @@ -779,7 +834,7 @@ static const struct file_operations ipmi #define DEVICE_NAME "ipmidev" -static int ipmi_major = 0; +static int ipmi_major; module_param(ipmi_major, int, 0); MODULE_PARM_DESC(ipmi_major, "Sets the major number of the IPMI device. By" " default, or if you set it to zero, it will choose the next" diff -Naurp linux-2.6.18.noarch/drivers/char/ipmi/ipmi_kcs_sm.c linux-ipmi/drivers/char/ipmi/ipmi_kcs_sm.c --- linux-2.6.18.noarch/drivers/char/ipmi/ipmi_kcs_sm.c 2006-09-19 23:42:06.000000000 -0400 +++ linux-ipmi/drivers/char/ipmi/ipmi_kcs_sm.c 2007-06-11 15:23:49.000000000 -0400 @@ -93,8 +93,8 @@ enum kcs_states { state machine. */ }; -#define MAX_KCS_READ_SIZE 80 -#define MAX_KCS_WRITE_SIZE 80 +#define MAX_KCS_READ_SIZE IPMI_MAX_MSG_LENGTH +#define MAX_KCS_WRITE_SIZE IPMI_MAX_MSG_LENGTH /* Timeouts in microseconds. */ #define IBF_RETRY_TIMEOUT 1000000 @@ -261,12 +261,14 @@ static int start_kcs_transaction(struct { unsigned int i; - if ((size < 2) || (size > MAX_KCS_WRITE_SIZE)) { - return -1; - } - if ((kcs->state != KCS_IDLE) && (kcs->state != KCS_HOSED)) { - return -2; - } + if (size < 2) + return IPMI_REQ_LEN_INVALID_ERR; + if (size > MAX_KCS_WRITE_SIZE) + return IPMI_REQ_LEN_EXCEEDED_ERR; + + if ((kcs->state != KCS_IDLE) && (kcs->state != KCS_HOSED)) + return IPMI_NOT_IN_MY_STATE_ERR; + if (kcs_debug & KCS_DEBUG_MSG) { printk(KERN_DEBUG "start_kcs_transaction -"); for (i = 0; i < size; i ++) { diff -Naurp linux-2.6.18.noarch/drivers/char/ipmi/ipmi_msghandler.c linux-ipmi/drivers/char/ipmi/ipmi_msghandler.c --- linux-2.6.18.noarch/drivers/char/ipmi/ipmi_msghandler.c 2007-06-11 15:26:45.000000000 -0400 +++ linux-ipmi/drivers/char/ipmi/ipmi_msghandler.c 2007-06-11 15:23:49.000000000 -0400 @@ -48,17 +48,20 @@ #define PFX "IPMI message handler: " -#define IPMI_DRIVER_VERSION "39.0" +#define IPMI_DRIVER_VERSION "39.1" static struct ipmi_recv_msg *ipmi_alloc_recv_msg(void); static int ipmi_init_msghandler(void); -static int initialized = 0; +static int initialized; #ifdef CONFIG_PROC_FS -static struct proc_dir_entry *proc_ipmi_root = NULL; +static struct proc_dir_entry *proc_ipmi_root; #endif /* CONFIG_PROC_FS */ +/* Remain in auto-maintenance mode for this amount of time (in ms). */ +#define IPMI_MAINTENANCE_MODE_TIMEOUT 30000 + #define MAX_EVENTS_IN_QUEUE 25 /* Don't let a message sit in a queue forever, always time it with at lest @@ -96,6 +99,7 @@ struct cmd_rcvr ipmi_user_t user; unsigned char netfn; unsigned char cmd; + unsigned int chans; /* * This is used to form a linked lised during mass deletion. @@ -192,10 +196,17 @@ struct ipmi_smi struct kref refcount; + /* Used for a list of interfaces. */ + struct list_head link; + /* The list of upper layers that are using me. seq_lock * protects this. */ struct list_head users; + /* Information to supply to users. */ + unsigned char ipmi_version_major; + unsigned char ipmi_version_minor; + /* Used for wake ups at startup. */ wait_queue_head_t waitq; @@ -203,7 +214,10 @@ struct ipmi_smi char *my_dev_name; char *sysfs_name; - /* This is the lower-layer's sender routine. */ + /* This is the lower-layer's sender routine. Note that you + * must either be holding the ipmi_interfaces_mutex or be in + * an umpreemptible region to use this. You must fetch the + * value into a local variable and make sure it is not NULL. */ struct ipmi_smi_handlers *handlers; void *send_info; @@ -242,6 +256,7 @@ struct ipmi_smi spinlock_t events_lock; /* For dealing with event stuff. */ struct list_head waiting_events; unsigned int waiting_events_count; /* How many events in queue? */ + int delivering_events; /* The event receiver for my BMC, only really used at panic shutdown as a place to store this. */ @@ -250,6 +265,12 @@ struct ipmi_smi unsigned char local_sel_device; unsigned char local_event_generator; + /* For handling of maintenance mode. */ + int maintenance_mode; + int maintenance_mode_enable; + int auto_maintenance_timeout; + spinlock_t maintenance_mode_lock; /* Used in a timer... */ + /* A cheap hack, if this is non-null and a message to an interface comes in with a NULL user, call this routine with it. Note that the message will still be freed by the @@ -338,13 +359,6 @@ struct ipmi_smi }; #define to_si_intf_from_dev(device) container_of(device, struct ipmi_smi, dev) -/* Used to mark an interface entry that cannot be used but is not a - * free entry, either, primarily used at creation and deletion time so - * a slot doesn't get reused too quickly. */ -#define IPMI_INVALID_INTERFACE_ENTRY ((ipmi_smi_t) ((long) 1)) -#define IPMI_INVALID_INTERFACE(i) (((i) == NULL) \ - || (i == IPMI_INVALID_INTERFACE_ENTRY)) - /** * The driver model view of the IPMI messaging driver. */ @@ -354,16 +368,13 @@ static struct device_driver ipmidriver = }; static DEFINE_MUTEX(ipmidriver_mutex); -#define MAX_IPMI_INTERFACES 8 -static ipmi_smi_t ipmi_interfaces[MAX_IPMI_INTERFACES]; - -/* Directly protects the ipmi_interfaces data structure. */ -static DEFINE_SPINLOCK(interfaces_lock); +static struct list_head ipmi_interfaces = LIST_HEAD_INIT(ipmi_interfaces); +static DEFINE_MUTEX(ipmi_interfaces_mutex); /* List of watchers that want to know when smi's are added and deleted. */ static struct list_head smi_watchers = LIST_HEAD_INIT(smi_watchers); -static DECLARE_RWSEM(smi_watchers_sem); +static DEFINE_MUTEX(smi_watchers_mutex); static void free_recv_msg_list(struct list_head *q) @@ -376,22 +387,33 @@ static void free_recv_msg_list(struct li } } +static void free_smi_msg_list(struct list_head *q) +{ + struct ipmi_smi_msg *msg, *msg2; + + list_for_each_entry_safe(msg, msg2, q, link) { + list_del(&msg->link); + ipmi_free_smi_msg(msg); + } +} + static void clean_up_interface_data(ipmi_smi_t intf) { int i; struct cmd_rcvr *rcvr, *rcvr2; struct list_head list; - free_recv_msg_list(&intf->waiting_msgs); + free_smi_msg_list(&intf->waiting_msgs); free_recv_msg_list(&intf->waiting_events); - /* Wholesale remove all the entries from the list in the - * interface and wait for RCU to know that none are in use. */ + /* + * Wholesale remove all the entries from the list in the + * interface and wait for RCU to know that none are in use. + */ mutex_lock(&intf->cmd_rcvrs_mutex); - list_add_rcu(&list, &intf->cmd_rcvrs); - list_del_rcu(&intf->cmd_rcvrs); + INIT_LIST_HEAD(&list); + list_splice_init_rcu(&intf->cmd_rcvrs, &list, synchronize_rcu); mutex_unlock(&intf->cmd_rcvrs_mutex); - synchronize_rcu(); list_for_each_entry_safe(rcvr, rcvr2, &list, link) kfree(rcvr); @@ -413,48 +435,84 @@ static void intf_free(struct kref *ref) kfree(intf); } +struct watcher_entry { + int intf_num; + ipmi_smi_t intf; + struct list_head link; +}; + int ipmi_smi_watcher_register(struct ipmi_smi_watcher *watcher) { - int i; - unsigned long flags; + ipmi_smi_t intf; + struct list_head to_deliver = LIST_HEAD_INIT(to_deliver); + struct watcher_entry *e, *e2; + + mutex_lock(&smi_watchers_mutex); - down_write(&smi_watchers_sem); - list_add(&(watcher->link), &smi_watchers); - up_write(&smi_watchers_sem); - spin_lock_irqsave(&interfaces_lock, flags); - for (i = 0; i < MAX_IPMI_INTERFACES; i++) { - ipmi_smi_t intf = ipmi_interfaces[i]; - if (IPMI_INVALID_INTERFACE(intf)) + mutex_lock(&ipmi_interfaces_mutex); + + /* Build a list of things to deliver. */ + list_for_each_entry(intf, &ipmi_interfaces, link) { + if (intf->intf_num == -1) continue; - spin_unlock_irqrestore(&interfaces_lock, flags); - watcher->new_smi(i, intf->si_dev); - spin_lock_irqsave(&interfaces_lock, flags); + e = kmalloc(sizeof(*e), GFP_KERNEL); + if (!e) + goto out_err; + kref_get(&intf->refcount); + e->intf = intf; + e->intf_num = intf->intf_num; + list_add_tail(&e->link, &to_deliver); + } + + /* We will succeed, so add it to the list. */ + list_add(&watcher->link, &smi_watchers); + + mutex_unlock(&ipmi_interfaces_mutex); + + list_for_each_entry_safe(e, e2, &to_deliver, link) { + list_del(&e->link); + watcher->new_smi(e->intf_num, e->intf->si_dev); + kref_put(&e->intf->refcount, intf_free); + kfree(e); } - spin_unlock_irqrestore(&interfaces_lock, flags); + + mutex_unlock(&smi_watchers_mutex); + return 0; + +out_err: + mutex_unlock(&ipmi_interfaces_mutex); + mutex_unlock(&smi_watchers_mutex); + list_for_each_entry_safe(e, e2, &to_deliver, link) { + list_del(&e->link); + kref_put(&e->intf->refcount, intf_free); + kfree(e); + } + return -ENOMEM; } int ipmi_smi_watcher_unregister(struct ipmi_smi_watcher *watcher) { - down_write(&smi_watchers_sem); + mutex_lock(&smi_watchers_mutex); list_del(&(watcher->link)); - up_write(&smi_watchers_sem); + mutex_unlock(&smi_watchers_mutex); return 0; } +/* + * Must be called with smi_watchers_mutex held. + */ static void call_smi_watchers(int i, struct device *dev) { struct ipmi_smi_watcher *w; - down_read(&smi_watchers_sem); list_for_each_entry(w, &smi_watchers, link) { if (try_module_get(w->owner)) { w->new_smi(i, dev); module_put(w->owner); } } - up_read(&smi_watchers_sem); } static int @@ -580,6 +638,17 @@ static void deliver_response(struct ipmi } } +static void +deliver_err_response(struct ipmi_recv_msg *msg, int err) +{ + msg->recv_type = IPMI_RESPONSE_RECV_TYPE; + msg->msg_data[0] = err; + msg->msg.netfn |= 1; /* Convert to a response. */ + msg->msg.data_len = 1; + msg->msg.data = msg->msg_data; + deliver_response(msg); +} + /* Find the next sequence number not being used and add the given message with the given timeout to the sequence table. This must be called with the interface's seq_lock held. */ @@ -717,14 +786,8 @@ static int intf_err_seq(ipmi_smi_t int } spin_unlock_irqrestore(&(intf->seq_lock), flags); - if (msg) { - msg->recv_type = IPMI_RESPONSE_RECV_TYPE; - msg->msg_data[0] = err; - msg->msg.netfn |= 1; /* Convert to a response. */ - msg->msg.data_len = 1; - msg->msg.data = msg->msg_data; - deliver_response(msg); - } + if (msg) + deliver_err_response(msg, err); return rv; } @@ -766,17 +829,18 @@ int ipmi_create_user(unsigned int if (!new_user) return -ENOMEM; - spin_lock_irqsave(&interfaces_lock, flags); - intf = ipmi_interfaces[if_num]; - if ((if_num >= MAX_IPMI_INTERFACES) || IPMI_INVALID_INTERFACE(intf)) { - spin_unlock_irqrestore(&interfaces_lock, flags); - rv = -EINVAL; - goto out_kfree; - } + mutex_lock(&ipmi_interfaces_mutex); + list_for_each_entry_rcu(intf, &ipmi_interfaces, link) { + if (intf->intf_num == if_num) + goto found; + } + /* Not found, return an error */ + rv = -EINVAL; + goto out_kfree; + found: /* Note that each existing user holds a refcount to the interface. */ kref_get(&intf->refcount); - spin_unlock_irqrestore(&interfaces_lock, flags); kref_init(&new_user->refcount); new_user->handler = handler; @@ -797,6 +861,10 @@ int ipmi_create_user(unsigned int } } + /* Hold the lock so intf->handlers is guaranteed to be good + * until now */ + mutex_unlock(&ipmi_interfaces_mutex); + new_user->valid = 1; spin_lock_irqsave(&intf->seq_lock, flags); list_add_rcu(&new_user->link, &intf->users); @@ -807,6 +875,7 @@ int ipmi_create_user(unsigned int out_kref: kref_put(&intf->refcount, intf_free); out_kfree: + mutex_unlock(&ipmi_interfaces_mutex); kfree(new_user); return rv; } @@ -836,6 +905,7 @@ int ipmi_destroy_user(ipmi_user_t user) && (intf->seq_table[i].recv_msg->user == user)) { intf->seq_table[i].inuse = 0; + ipmi_free_recv_msg(intf->seq_table[i].recv_msg); } } spin_unlock_irqrestore(&intf->seq_lock, flags); @@ -862,9 +932,13 @@ int ipmi_destroy_user(ipmi_user_t user) kfree(rcvr); } - module_put(intf->handlers->owner); - if (intf->handlers->dec_usecount) - intf->handlers->dec_usecount(intf->send_info); + mutex_lock(&ipmi_interfaces_mutex); + if (intf->handlers) { + module_put(intf->handlers->owner); + if (intf->handlers->dec_usecount) + intf->handlers->dec_usecount(intf->send_info); + } + mutex_unlock(&ipmi_interfaces_mutex); kref_put(&intf->refcount, intf_free); @@ -877,8 +951,8 @@ void ipmi_get_version(ipmi_user_t user unsigned char *major, unsigned char *minor) { - *major = ipmi_version_major(&user->intf->bmc->id); - *minor = ipmi_version_minor(&user->intf->bmc->id); + *major = user->intf->ipmi_version_major; + *minor = user->intf->ipmi_version_minor; } int ipmi_set_my_address(ipmi_user_t user, @@ -921,6 +995,65 @@ int ipmi_get_my_LUN(ipmi_user_t user, return 0; } +int ipmi_get_maintenance_mode(ipmi_user_t user) +{ + int mode; + unsigned long flags; + + spin_lock_irqsave(&user->intf->maintenance_mode_lock, flags); + mode = user->intf->maintenance_mode; + spin_unlock_irqrestore(&user->intf->maintenance_mode_lock, flags); + + return mode; +} +EXPORT_SYMBOL(ipmi_get_maintenance_mode); + +static void maintenance_mode_update(ipmi_smi_t intf) +{ + if (intf->handlers->set_maintenance_mode) + intf->handlers->set_maintenance_mode( + intf->send_info, intf->maintenance_mode_enable); +} + +int ipmi_set_maintenance_mode(ipmi_user_t user, int mode) +{ + int rv = 0; + unsigned long flags; + ipmi_smi_t intf = user->intf; + + spin_lock_irqsave(&intf->maintenance_mode_lock, flags); + if (intf->maintenance_mode != mode) { + switch (mode) { + case IPMI_MAINTENANCE_MODE_AUTO: + intf->maintenance_mode = mode; + intf->maintenance_mode_enable + = (intf->auto_maintenance_timeout > 0); + break; + + case IPMI_MAINTENANCE_MODE_OFF: + intf->maintenance_mode = mode; + intf->maintenance_mode_enable = 0; + break; + + case IPMI_MAINTENANCE_MODE_ON: + intf->maintenance_mode = mode; + intf->maintenance_mode_enable = 1; + break; + + default: + rv = -EINVAL; + goto out_unlock; + } + + maintenance_mode_update(intf); + } + out_unlock: + spin_unlock_irqrestore(&intf->maintenance_mode_lock, flags); + + return rv; +} +EXPORT_SYMBOL(ipmi_set_maintenance_mode); + int ipmi_set_gets_events(ipmi_user_t user, int val) { unsigned long flags; @@ -933,20 +1066,33 @@ int ipmi_set_gets_events(ipmi_user_t use spin_lock_irqsave(&intf->events_lock, flags); user->gets_events = val; - if (val) { - /* Deliver any queued events. */ + if (intf->delivering_events) + /* + * Another thread is delivering events for this, so + * let it handle any new events. + */ + goto out; + + /* Deliver any queued events. */ + while (user->gets_events && !list_empty(&intf->waiting_events)) { list_for_each_entry_safe(msg, msg2, &intf->waiting_events, link) list_move_tail(&msg->link, &msgs); intf->waiting_events_count = 0; - } - /* Hold the events lock while doing this to preserve order. */ - list_for_each_entry_safe(msg, msg2, &msgs, link) { - msg->user = user; - kref_get(&user->refcount); - deliver_response(msg); + intf->delivering_events = 1; + spin_unlock_irqrestore(&intf->events_lock, flags); + + list_for_each_entry_safe(msg, msg2, &msgs, link) { + msg->user = user; + kref_get(&user->refcount); + deliver_response(msg); + } + + spin_lock_irqsave(&intf->events_lock, flags); + intf->delivering_events = 0; } + out: spin_unlock_irqrestore(&intf->events_lock, flags); return 0; @@ -954,24 +1100,41 @@ int ipmi_set_gets_events(ipmi_user_t use static struct cmd_rcvr *find_cmd_rcvr(ipmi_smi_t intf, unsigned char netfn, - unsigned char cmd) + unsigned char cmd, + unsigned char chan) { struct cmd_rcvr *rcvr; list_for_each_entry_rcu(rcvr, &intf->cmd_rcvrs, link) { - if ((rcvr->netfn == netfn) && (rcvr->cmd == cmd)) + if ((rcvr->netfn == netfn) && (rcvr->cmd == cmd) + && (rcvr->chans & (1 << chan))) return rcvr; } return NULL; } +static int is_cmd_rcvr_exclusive(ipmi_smi_t intf, + unsigned char netfn, + unsigned char cmd, + unsigned int chans) +{ + struct cmd_rcvr *rcvr; + + list_for_each_entry_rcu(rcvr, &intf->cmd_rcvrs, link) { + if ((rcvr->netfn == netfn) && (rcvr->cmd == cmd) + && (rcvr->chans & chans)) + return 0; + } + return 1; +} + int ipmi_register_for_cmd(ipmi_user_t user, unsigned char netfn, - unsigned char cmd) + unsigned char cmd, + unsigned int chans) { ipmi_smi_t intf = user->intf; struct cmd_rcvr *rcvr; - struct cmd_rcvr *entry; int rv = 0; @@ -980,12 +1143,12 @@ int ipmi_register_for_cmd(ipmi_user_t return -ENOMEM; rcvr->cmd = cmd; rcvr->netfn = netfn; + rcvr->chans = chans; rcvr->user = user; mutex_lock(&intf->cmd_rcvrs_mutex); /* Make sure the command/netfn is not already registered. */ - entry = find_cmd_rcvr(intf, netfn, cmd); - if (entry) { + if (!is_cmd_rcvr_exclusive(intf, netfn, cmd, chans)) { rv = -EBUSY; goto out_unlock; } @@ -1002,30 +1165,46 @@ int ipmi_register_for_cmd(ipmi_user_t int ipmi_unregister_for_cmd(ipmi_user_t user, unsigned char netfn, - unsigned char cmd) + unsigned char cmd, + unsigned int chans) { ipmi_smi_t intf = user->intf; struct cmd_rcvr *rcvr; + struct cmd_rcvr *rcvrs = NULL; + int i, rv = -ENOENT; mutex_lock(&intf->cmd_rcvrs_mutex); - /* Make sure the command/netfn is not already registered. */ - rcvr = find_cmd_rcvr(intf, netfn, cmd); - if ((rcvr) && (rcvr->user == user)) { - list_del_rcu(&rcvr->link); - mutex_unlock(&intf->cmd_rcvrs_mutex); - synchronize_rcu(); + for (i = 0; i < IPMI_NUM_CHANNELS; i++) { + if (((1 << i) & chans) == 0) + continue; + rcvr = find_cmd_rcvr(intf, netfn, cmd, i); + if (rcvr == NULL) + continue; + if (rcvr->user == user) { + rv = 0; + rcvr->chans &= ~chans; + if (rcvr->chans == 0) { + list_del_rcu(&rcvr->link); + rcvr->next = rcvrs; + rcvrs = rcvr; + } + } + } + mutex_unlock(&intf->cmd_rcvrs_mutex); + synchronize_rcu(); + while (rcvrs) { + rcvr = rcvrs; + rcvrs = rcvr->next; kfree(rcvr); - return 0; - } else { - mutex_unlock(&intf->cmd_rcvrs_mutex); - return -ENOENT; } + return rv; } void ipmi_user_set_run_to_completion(ipmi_user_t user, int val) { ipmi_smi_t intf = user->intf; - intf->handlers->set_run_to_completion(intf->send_info, val); + if (intf->handlers) + intf->handlers->set_run_to_completion(intf->send_info, val); } static unsigned char @@ -1136,10 +1315,11 @@ static int i_ipmi_request(ipmi_user_t int retries, unsigned int retry_time_ms) { - int rv = 0; - struct ipmi_smi_msg *smi_msg; - struct ipmi_recv_msg *recv_msg; - unsigned long flags; + int rv = 0; + struct ipmi_smi_msg *smi_msg; + struct ipmi_recv_msg *recv_msg; + unsigned long flags; + struct ipmi_smi_handlers *handlers; if (supplied_recv) { @@ -1162,6 +1342,13 @@ static int i_ipmi_request(ipmi_user_t } } + rcu_read_lock(); + handlers = intf->handlers; + if (!handlers) { + rv = -ENODEV; + goto out_err; + } + recv_msg->user = user; if (user) kref_get(&user->refcount); @@ -1204,6 +1391,24 @@ static int i_ipmi_request(ipmi_user_t goto out_err; } + if (((msg->netfn == IPMI_NETFN_APP_REQUEST) + && ((msg->cmd == IPMI_COLD_RESET_CMD) + || (msg->cmd == IPMI_WARM_RESET_CMD))) + || (msg->netfn == IPMI_NETFN_FIRMWARE_REQUEST)) + { + spin_lock_irqsave(&intf->maintenance_mode_lock, flags); + intf->auto_maintenance_timeout + = IPMI_MAINTENANCE_MODE_TIMEOUT; + if (!intf->maintenance_mode + && !intf->maintenance_mode_enable) + { + intf->maintenance_mode_enable = 1; + maintenance_mode_update(intf); + } + spin_unlock_irqrestore(&intf->maintenance_mode_lock, + flags); + } + if ((msg->data_len + 2) > IPMI_MAX_MSG_LENGTH) { spin_lock_irqsave(&intf->counter_lock, flags); intf->sent_invalid_commands++; @@ -1478,11 +1683,14 @@ static int i_ipmi_request(ipmi_user_t printk("\n"); } #endif - intf->handlers->sender(intf->send_info, smi_msg, priority); + + handlers->sender(intf->send_info, smi_msg, priority); + rcu_read_unlock(); return 0; out_err: + rcu_read_unlock(); ipmi_free_smi_msg(smi_msg); ipmi_free_recv_msg(recv_msg); return rv; @@ -1562,6 +1770,7 @@ int ipmi_request_supply_msgs(ipmi_user_t -1, 0); } +#ifdef CONFIG_PROC_FS static int ipmb_file_read_proc(char *page, char **start, off_t off, int count, int *eof, void *data) { @@ -1650,6 +1859,7 @@ static int stat_file_read_proc(char *pag return (out - ((char *) page)); } +#endif /* CONFIG_PROC_FS */ int ipmi_smi_add_proc_entry(ipmi_smi_t smi, char *name, read_proc_t *read_proc, write_proc_t *write_proc, @@ -1811,7 +2021,7 @@ static ssize_t provides_dev_sdrs_show(st struct bmc_device *bmc = dev_get_drvdata(dev); return snprintf(buf, 10, "%u\n", - bmc->id.device_revision && 0x80 >> 7); + (bmc->id.device_revision & 0x80) >> 7); } static ssize_t revision_show(struct device *dev, struct device_attribute *attr, @@ -1820,7 +2030,7 @@ static ssize_t revision_show(struct devi struct bmc_device *bmc = dev_get_drvdata(dev); return snprintf(buf, 20, "%u\n", - bmc->id.device_revision && 0x0F); + bmc->id.device_revision & 0x0F); } static ssize_t firmware_rev_show(struct device *dev, @@ -1933,8 +2143,7 @@ cleanup_bmc_device(struct kref *ref) bmc = container_of(ref, struct bmc_device, refcount); remove_files(bmc); - if (bmc->dev) - platform_device_unregister(bmc->dev); + platform_device_unregister(bmc->dev); kfree(bmc); } @@ -2132,8 +2341,7 @@ static int ipmi_bmc_register(ipmi_smi_t while (ipmi_find_bmc_prod_dev_id(&ipmidriver, bmc->id.product_id, - bmc->id.device_id)) - { + bmc->id.device_id)) { if (!warn_printed) { printk(KERN_WARNING PFX "This machine has two different BMCs" @@ -2428,12 +2636,8 @@ int ipmi_register_smi(struct ipmi_smi_ha int i, j; int rv; ipmi_smi_t intf; - unsigned long flags; - int version_major; - int version_minor; - - version_major = ipmi_version_major(device_id); - version_minor = ipmi_version_minor(device_id); + ipmi_smi_t tintf; + struct list_head *link; /* Make sure the driver is actually initialized, this handles problems with initialization order. */ @@ -2451,12 +2655,16 @@ int ipmi_register_smi(struct ipmi_smi_ha if (!intf) return -ENOMEM; memset(intf, 0, sizeof(*intf)); + + intf->ipmi_version_major = ipmi_version_major(device_id); + intf->ipmi_version_minor = ipmi_version_minor(device_id); + intf->bmc = kzalloc(sizeof(*intf->bmc), GFP_KERNEL); if (!intf->bmc) { kfree(intf); return -ENOMEM; } - intf->intf_num = -1; + intf->intf_num = -1; /* Mark it invalid for now. */ kref_init(&intf->refcount); intf->bmc->id = *device_id; intf->si_dev = si_dev; @@ -2484,26 +2692,30 @@ int ipmi_register_smi(struct ipmi_smi_ha INIT_LIST_HEAD(&intf->waiting_events); intf->waiting_events_count = 0; mutex_init(&intf->cmd_rcvrs_mutex); + spin_lock_init(&intf->maintenance_mode_lock); INIT_LIST_HEAD(&intf->cmd_rcvrs); init_waitqueue_head(&intf->waitq); spin_lock_init(&intf->counter_lock); intf->proc_dir = NULL; - rv = -ENOMEM; - spin_lock_irqsave(&interfaces_lock, flags); - for (i = 0; i < MAX_IPMI_INTERFACES; i++) { - if (ipmi_interfaces[i] == NULL) { - intf->intf_num = i; - /* Reserve the entry till we are done. */ - ipmi_interfaces[i] = IPMI_INVALID_INTERFACE_ENTRY; - rv = 0; + mutex_lock(&smi_watchers_mutex); + mutex_lock(&ipmi_interfaces_mutex); + /* Look for a hole in the numbers. */ + i = 0; + link = &ipmi_interfaces; + list_for_each_entry_rcu(tintf, &ipmi_interfaces, link) { + if (tintf->intf_num != i) { + link = &tintf->link; break; } + i++; } - spin_unlock_irqrestore(&interfaces_lock, flags); - if (rv) - goto out; + /* Add the new interface in numeric order. */ + if (i == 0) + list_add_rcu(&intf->link, &ipmi_interfaces); + else + list_add_tail_rcu(&intf->link, link); rv = handlers->start_processing(send_info, intf); if (rv) @@ -2511,8 +2723,9 @@ int ipmi_register_smi(struct ipmi_smi_ha get_guid(intf); - if ((version_major > 1) - || ((version_major == 1) && (version_minor >= 5))) + if ((intf->ipmi_version_major > 1) + || ((intf->ipmi_version_major == 1) + && (intf->ipmi_version_minor >= 5))) { /* Start scanning the channels to see what is available. */ @@ -2541,58 +2754,67 @@ int ipmi_register_smi(struct ipmi_smi_ha if (rv) { if (intf->proc_dir) remove_proc_entries(intf); + intf->handlers = NULL; + list_del_rcu(&intf->link); + mutex_unlock(&ipmi_interfaces_mutex); + mutex_unlock(&smi_watchers_mutex); + synchronize_rcu(); kref_put(&intf->refcount, intf_free); - if (i < MAX_IPMI_INTERFACES) { - spin_lock_irqsave(&interfaces_lock, flags); - ipmi_interfaces[i] = NULL; - spin_unlock_irqrestore(&interfaces_lock, flags); - } } else { - spin_lock_irqsave(&interfaces_lock, flags); - ipmi_interfaces[i] = intf; - spin_unlock_irqrestore(&interfaces_lock, flags); + /* + * Keep memory order straight for RCU readers. Make + * sure everything else is committed to memory before + * setting intf_num to mark the interface valid. + */ + smp_wmb(); + intf->intf_num = i; + mutex_unlock(&ipmi_interfaces_mutex); + /* After this point the interface is legal to use. */ call_smi_watchers(i, intf->si_dev); + mutex_unlock(&smi_watchers_mutex); } return rv; } +static void cleanup_smi_msgs(ipmi_smi_t intf) +{ + int i; + struct seq_table *ent; + + /* No need for locks, the interface is down. */ + for (i = 0; i < IPMI_IPMB_NUM_SEQ; i++) { + ent = &(intf->seq_table[i]); + if (!ent->inuse) + continue; + deliver_err_response(ent->recv_msg, IPMI_ERR_UNSPECIFIED); + } +} + int ipmi_unregister_smi(ipmi_smi_t intf) { - int i; struct ipmi_smi_watcher *w; - unsigned long flags; + int intf_num = intf->intf_num; ipmi_bmc_unregister(intf); - spin_lock_irqsave(&interfaces_lock, flags); - for (i = 0; i < MAX_IPMI_INTERFACES; i++) { - if (ipmi_interfaces[i] == intf) { - /* Set the interface number reserved until we - * are done. */ - ipmi_interfaces[i] = IPMI_INVALID_INTERFACE_ENTRY; - intf->intf_num = -1; - break; - } - } - spin_unlock_irqrestore(&interfaces_lock,flags); + mutex_lock(&smi_watchers_mutex); + mutex_lock(&ipmi_interfaces_mutex); + intf->intf_num = -1; + intf->handlers = NULL; + list_del_rcu(&intf->link); + mutex_unlock(&ipmi_interfaces_mutex); + synchronize_rcu(); - if (i == MAX_IPMI_INTERFACES) - return -ENODEV; + cleanup_smi_msgs(intf); remove_proc_entries(intf); /* Call all the watcher interfaces to tell them that an interface is gone. */ - down_read(&smi_watchers_sem); list_for_each_entry(w, &smi_watchers, link) - w->smi_gone(i); - up_read(&smi_watchers_sem); - - /* Allow the entry to be reused now. */ - spin_lock_irqsave(&interfaces_lock, flags); - ipmi_interfaces[i] = NULL; - spin_unlock_irqrestore(&interfaces_lock,flags); + w->smi_gone(intf_num); + mutex_unlock(&smi_watchers_mutex); kref_put(&intf->refcount, intf_free); return 0; @@ -2669,10 +2891,12 @@ static int handle_ipmb_get_msg_cmd(ipmi_ int rv = 0; unsigned char netfn; unsigned char cmd; + unsigned char chan; ipmi_user_t user = NULL; struct ipmi_ipmb_addr *ipmb_addr; struct ipmi_recv_msg *recv_msg; unsigned long flags; + struct ipmi_smi_handlers *handlers; if (msg->rsp_size < 10) { /* Message not big enough, just ignore it. */ @@ -2689,9 +2913,10 @@ static int handle_ipmb_get_msg_cmd(ipmi_ netfn = msg->rsp[4] >> 2; cmd = msg->rsp[8]; + chan = msg->rsp[3] & 0xf; rcu_read_lock(); - rcvr = find_cmd_rcvr(intf, netfn, cmd); + rcvr = find_cmd_rcvr(intf, netfn, cmd, chan); if (rcvr) { user = rcvr->user; kref_get(&user->refcount); @@ -2728,10 +2953,16 @@ static int handle_ipmb_get_msg_cmd(ipmi_ printk("\n"); } #endif - intf->handlers->sender(intf->send_info, msg, 0); - - rv = -1; /* We used the message, so return the value that - causes it to not be freed or queued. */ + rcu_read_lock(); + handlers = intf->handlers; + if (handlers) { + handlers->sender(intf->send_info, msg, 0); + /* We used the message, so return the value + that causes it to not be freed or + queued. */ + rv = -1; + } + rcu_read_unlock(); } else { /* Deliver the message to the user. */ spin_lock_irqsave(&intf->counter_lock, flags); @@ -2849,6 +3080,7 @@ static int handle_lan_get_msg_cmd(ipmi_s int rv = 0; unsigned char netfn; unsigned char cmd; + unsigned char chan; ipmi_user_t user = NULL; struct ipmi_lan_addr *lan_addr; struct ipmi_recv_msg *recv_msg; @@ -2869,9 +3101,10 @@ static int handle_lan_get_msg_cmd(ipmi_s netfn = msg->rsp[6] >> 2; cmd = msg->rsp[10]; + chan = msg->rsp[3] & 0xf; rcu_read_lock(); - rcvr = find_cmd_rcvr(intf, netfn, cmd); + rcvr = find_cmd_rcvr(intf, netfn, cmd, chan); if (rcvr) { user = rcvr->user; kref_get(&user->refcount); @@ -3252,7 +3485,9 @@ void ipmi_smi_msg_received(ipmi_smi_t report the error immediately. */ if ((msg->rsp_size >= 3) && (msg->rsp[2] != 0) && (msg->rsp[2] != IPMI_NODE_BUSY_ERR) - && (msg->rsp[2] != IPMI_LOST_ARBITRATION_ERR)) + && (msg->rsp[2] != IPMI_LOST_ARBITRATION_ERR) + && (msg->rsp[2] != IPMI_BUS_ERR) + && (msg->rsp[2] != IPMI_NAK_ON_WRITE_ERR)) { int chan = msg->rsp[3] & 0xf; @@ -3317,16 +3552,6 @@ void ipmi_smi_watchdog_pretimeout(ipmi_s rcu_read_unlock(); } -static void -handle_msg_timeout(struct ipmi_recv_msg *msg) -{ - msg->recv_type = IPMI_RESPONSE_RECV_TYPE; - msg->msg_data[0] = IPMI_TIMEOUT_COMPLETION_CODE; - msg->msg.netfn |= 1; /* Convert to a response. */ - msg->msg.data_len = 1; - msg->msg.data = msg->msg_data; - deliver_response(msg); -} static struct ipmi_smi_msg * smi_from_recv_msg(ipmi_smi_t intf, struct ipmi_recv_msg *recv_msg, @@ -3358,7 +3583,11 @@ static void check_msg_timeout(ipmi_smi_t struct list_head *timeouts, long timeout_period, int slot, unsigned long *flags) { - struct ipmi_recv_msg *msg; + struct ipmi_recv_msg *msg; + struct ipmi_smi_handlers *handlers; + + if (intf->intf_num == -1) + return; if (!ent->inuse) return; @@ -3401,13 +3630,19 @@ static void check_msg_timeout(ipmi_smi_t return; spin_unlock_irqrestore(&intf->seq_lock, *flags); + /* Send the new message. We send with a zero * priority. It timed out, I doubt time is * that critical now, and high priority * messages are really only for messages to the * local MC, which don't get resent. */ - intf->handlers->sender(intf->send_info, - smi_msg, 0); + handlers = intf->handlers; + if (handlers) + intf->handlers->sender(intf->send_info, + smi_msg, 0); + else + ipmi_free_smi_msg(smi_msg); + spin_lock_irqsave(&intf->seq_lock, *flags); } } @@ -3419,18 +3654,10 @@ static void ipmi_timeout_handler(long ti struct ipmi_recv_msg *msg, *msg2; struct ipmi_smi_msg *smi_msg, *smi_msg2; unsigned long flags; - int i, j; - - INIT_LIST_HEAD(&timeouts); - - spin_lock(&interfaces_lock); - for (i = 0; i < MAX_IPMI_INTERFACES; i++) { - intf = ipmi_interfaces[i]; - if (IPMI_INVALID_INTERFACE(intf)) - continue; - kref_get(&intf->refcount); - spin_unlock(&interfaces_lock); + int i; + rcu_read_lock(); + list_for_each_entry_rcu(intf, &ipmi_interfaces, link) { /* See if any waiting messages need to be processed. */ spin_lock_irqsave(&intf->waiting_msgs_lock, flags); list_for_each_entry_safe(smi_msg, smi_msg2, @@ -3449,36 +3676,62 @@ static void ipmi_timeout_handler(long ti /* Go through the seq table and find any messages that have timed out, putting them in the timeouts list. */ + INIT_LIST_HEAD(&timeouts); spin_lock_irqsave(&intf->seq_lock, flags); - for (j = 0; j < IPMI_IPMB_NUM_SEQ; j++) - check_msg_timeout(intf, &(intf->seq_table[j]), - &timeouts, timeout_period, j, + for (i = 0; i < IPMI_IPMB_NUM_SEQ; i++) + check_msg_timeout(intf, &(intf->seq_table[i]), + &timeouts, timeout_period, i, &flags); spin_unlock_irqrestore(&intf->seq_lock, flags); list_for_each_entry_safe(msg, msg2, &timeouts, link) - handle_msg_timeout(msg); + deliver_err_response(msg, IPMI_TIMEOUT_COMPLETION_CODE); - kref_put(&intf->refcount, intf_free); - spin_lock(&interfaces_lock); + /* + * Maintenance mode handling. Check the timeout + * optimistically before we claim the lock. It may + * mean a timeout gets missed occasionally, but that + * only means the timeout gets extended by one period + * in that case. No big deal, and it avoids the lock + * most of the time. + */ + if (intf->auto_maintenance_timeout > 0) { + spin_lock_irqsave(&intf->maintenance_mode_lock, flags); + if (intf->auto_maintenance_timeout > 0) { + intf->auto_maintenance_timeout + -= timeout_period; + if (!intf->maintenance_mode + && (intf->auto_maintenance_timeout <= 0)) + { + intf->maintenance_mode_enable = 0; + maintenance_mode_update(intf); + } + } + spin_unlock_irqrestore(&intf->maintenance_mode_lock, + flags); + } } - spin_unlock(&interfaces_lock); + rcu_read_unlock(); } static void ipmi_request_event(void) { - ipmi_smi_t intf; - int i; + ipmi_smi_t intf; + struct ipmi_smi_handlers *handlers; - spin_lock(&interfaces_lock); - for (i = 0; i < MAX_IPMI_INTERFACES; i++) { - intf = ipmi_interfaces[i]; - if (IPMI_INVALID_INTERFACE(intf)) + rcu_read_lock(); + /* Called from the timer, no need to check if handlers is + * valid. */ + list_for_each_entry_rcu(intf, &ipmi_interfaces, link) { + /* No event requests when in maintenance mode. */ + if (intf->maintenance_mode_enable) continue; - intf->handlers->request_events(intf->send_info); + handlers = intf->handlers; + if (handlers) + handlers->request_events(intf->send_info); } - spin_unlock(&interfaces_lock); + rcu_read_unlock(); } static struct timer_list ipmi_timer; @@ -3607,7 +3860,6 @@ static void send_panic_events(char *str) struct kernel_ipmi_msg msg; ipmi_smi_t intf; unsigned char data[16]; - int i; struct ipmi_system_interface_addr *si; struct ipmi_addr addr; struct ipmi_smi_msg smi_msg; @@ -3641,9 +3893,9 @@ static void send_panic_events(char *str) recv_msg.done = dummy_recv_done_handler; /* For every registered interface, send the event. */ - for (i = 0; i < MAX_IPMI_INTERFACES; i++) { - intf = ipmi_interfaces[i]; - if (IPMI_INVALID_INTERFACE(intf)) + list_for_each_entry_rcu(intf, &ipmi_interfaces, link) { + if (!intf->handlers) + /* Interface is not ready. */ continue; /* Send the event announcing the panic. */ @@ -3668,15 +3920,24 @@ static void send_panic_events(char *str) if (!str) return; - for (i = 0; i < MAX_IPMI_INTERFACES; i++) { + /* For every registered interface, send the event. */ + list_for_each_entry_rcu(intf, &ipmi_interfaces, link) { char *p = str; struct ipmi_ipmb_addr *ipmb; int j; - intf = ipmi_interfaces[i]; - if (IPMI_INVALID_INTERFACE(intf)) + if (intf->intf_num == -1) + /* Interface was not ready yet. */ continue; + /* + * intf_num is used as an marker to tell if the + * interface is valid. Thus we need a read barrier to + * make sure data fetched before checking intf_num + * won't be used. + */ + smp_rmb(); + /* First job here is to figure out where to send the OEM events. There's no way in IPMI to send OEM events using an event send command, so we have to @@ -3794,13 +4055,12 @@ static void send_panic_events(char *str) } #endif /* CONFIG_IPMI_PANIC_EVENT */ -static int has_panicked = 0; +static int has_panicked; static int panic_event(struct notifier_block *this, unsigned long event, void *ptr) { - int i; ipmi_smi_t intf; if (has_panicked) @@ -3808,9 +4068,9 @@ static int panic_event(struct notifier_b has_panicked = 1; /* For every registered interface, set it to run to completion. */ - for (i = 0; i < MAX_IPMI_INTERFACES; i++) { - intf = ipmi_interfaces[i]; - if (IPMI_INVALID_INTERFACE(intf)) + list_for_each_entry_rcu(intf, &ipmi_interfaces, link) { + if (!intf->handlers) + /* Interface is not ready. */ continue; intf->handlers->set_run_to_completion(intf->send_info, 1); @@ -3831,7 +4091,6 @@ static struct notifier_block panic_block static int ipmi_init_msghandler(void) { - int i; int rv; if (initialized) @@ -3846,9 +4105,6 @@ static int ipmi_init_msghandler(void) printk(KERN_INFO "ipmi message handler version " IPMI_DRIVER_VERSION "\n"); - for (i = 0; i < MAX_IPMI_INTERFACES; i++) - ipmi_interfaces[i] = NULL; - #ifdef CONFIG_PROC_FS proc_ipmi_root = proc_mkdir("ipmi", NULL); if (!proc_ipmi_root) { diff -Naurp linux-2.6.18.noarch/drivers/char/ipmi/ipmi_poweroff.c linux-ipmi/drivers/char/ipmi/ipmi_poweroff.c --- linux-2.6.18.noarch/drivers/char/ipmi/ipmi_poweroff.c 2006-09-19 23:42:06.000000000 -0400 +++ linux-ipmi/drivers/char/ipmi/ipmi_poweroff.c 2007-06-11 15:23:49.000000000 -0400 @@ -43,6 +43,9 @@ #define PFX "IPMI poweroff: " +static void ipmi_po_smi_gone(int if_num); +static void ipmi_po_new_smi(int if_num, struct device *device); + /* Definitions for controlling power off (if the system supports it). It * conveniently matches the IPMI chassis control values. */ #define IPMI_CHASSIS_POWER_DOWN 0 /* power down, the default. */ @@ -51,6 +54,37 @@ /* the IPMI data command */ static int poweroff_powercycle; +/* Which interface to use, -1 means the first we see. */ +static int ifnum_to_use = -1; + +/* Our local state. */ +static int ready; +static ipmi_user_t ipmi_user; +static int ipmi_ifnum; +static void (*specific_poweroff_func)(ipmi_user_t user); + +/* Holds the old poweroff function so we can restore it on removal. */ +static void (*old_poweroff_func)(void); + +static int set_param_ifnum(const char *val, struct kernel_param *kp) +{ + int rv = param_set_int(val, kp); + if (rv) + return rv; + if ((ifnum_to_use < 0) || (ifnum_to_use == ipmi_ifnum)) + return 0; + + ipmi_po_smi_gone(ipmi_ifnum); + ipmi_po_new_smi(ifnum_to_use, NULL); + return 0; +} + +module_param_call(ifnum_to_use, set_param_ifnum, param_get_int, + &ifnum_to_use, 0644); +MODULE_PARM_DESC(ifnum_to_use, "The interface number to use for the watchdog " + "timer. Setting to -1 defaults to the first registered " + "interface"); + /* parameter definition to allow user to flag power cycle */ module_param(poweroff_powercycle, int, 0644); MODULE_PARM_DESC(poweroff_powercycle, " Set to non-zero to enable power cycle instead of power down. Power cycle is contingent on hardware support, otherwise it defaults back to power down."); @@ -142,6 +176,42 @@ static int ipmi_request_in_rc_mode(ipmi_ #define IPMI_ATCA_GET_ADDR_INFO_CMD 0x01 #define IPMI_PICMG_ID 0 +#define IPMI_NETFN_OEM 0x2e +#define IPMI_ATCA_PPS_GRACEFUL_RESTART 0x11 +#define IPMI_ATCA_PPS_IANA "\x00\x40\x0A" +#define IPMI_MOTOROLA_MANUFACTURER_ID 0x0000A1 +#define IPMI_MOTOROLA_PPS_IPMC_PRODUCT_ID 0x0051 + +static void (*atca_oem_poweroff_hook)(ipmi_user_t user); + +static void pps_poweroff_atca (ipmi_user_t user) +{ + struct ipmi_system_interface_addr smi_addr; + struct kernel_ipmi_msg send_msg; + int rv; + /* + * Configure IPMI address for local access + */ + smi_addr.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE; + smi_addr.channel = IPMI_BMC_CHANNEL; + smi_addr.lun = 0; + + printk(KERN_INFO PFX "PPS powerdown hook used"); + + send_msg.netfn = IPMI_NETFN_OEM; + send_msg.cmd = IPMI_ATCA_PPS_GRACEFUL_RESTART; + send_msg.data = IPMI_ATCA_PPS_IANA; + send_msg.data_len = 3; + rv = ipmi_request_in_rc_mode(user, + (struct ipmi_addr *) &smi_addr, + &send_msg); + if (rv && rv != IPMI_UNKNOWN_ERR_COMPLETION_CODE) { + printk(KERN_ERR PFX "Unable to send ATCA ," + " IPMI error 0x%x\n", rv); + } + return; +} + static int ipmi_atca_detect (ipmi_user_t user) { struct ipmi_system_interface_addr smi_addr; @@ -167,6 +237,13 @@ static int ipmi_atca_detect (ipmi_user_t rv = ipmi_request_wait_for_response(user, (struct ipmi_addr *) &smi_addr, &send_msg); + + printk(KERN_INFO PFX "ATCA Detect mfg 0x%X prod 0x%X\n", mfg_id, prod_id); + if((mfg_id == IPMI_MOTOROLA_MANUFACTURER_ID) + && (prod_id == IPMI_MOTOROLA_PPS_IPMC_PRODUCT_ID)) { + printk(KERN_INFO PFX "Installing Pigeon Point Systems Poweroff Hook\n"); + atca_oem_poweroff_hook = pps_poweroff_atca; + } return !rv; } @@ -200,12 +277,19 @@ static void ipmi_poweroff_atca (ipmi_use rv = ipmi_request_in_rc_mode(user, (struct ipmi_addr *) &smi_addr, &send_msg); - if (rv) { + /** At this point, the system may be shutting down, and most + ** serial drivers (if used) will have interrupts turned off + ** it may be better to ignore IPMI_UNKNOWN_ERR_COMPLETION_CODE + ** return code + **/ + if (rv && rv != IPMI_UNKNOWN_ERR_COMPLETION_CODE) { printk(KERN_ERR PFX "Unable to send ATCA powerdown message," " IPMI error 0x%x\n", rv); goto out; } + if(atca_oem_poweroff_hook) + return atca_oem_poweroff_hook(user); out: return; } @@ -440,15 +524,6 @@ static struct poweroff_function poweroff / sizeof(struct poweroff_function)) -/* Our local state. */ -static int ready = 0; -static ipmi_user_t ipmi_user; -static void (*specific_poweroff_func)(ipmi_user_t user) = NULL; - -/* Holds the old poweroff function so we can restore it on removal. */ -static void (*old_poweroff_func)(void); - - /* Called on a powerdown request. */ static void ipmi_poweroff_function (void) { @@ -473,6 +548,9 @@ static void ipmi_po_new_smi(int if_num, if (ready) return; + if ((ifnum_to_use >= 0) && (ifnum_to_use != if_num)) + return; + rv = ipmi_create_user(if_num, &ipmi_poweroff_handler, NULL, &ipmi_user); if (rv) { @@ -481,7 +559,9 @@ static void ipmi_po_new_smi(int if_num, return; } - /* + ipmi_ifnum = if_num; + + /* * Do a get device ide and store some results, since this is * used by several functions. */ @@ -541,9 +621,15 @@ static void ipmi_po_new_smi(int if_num, static void ipmi_po_smi_gone(int if_num) { - /* This can never be called, because once poweroff driver is - registered, the interface can't go away until the power - driver is unregistered. */ + if (!ready) + return; + + if (ipmi_ifnum != if_num) + return; + + ready = 0; + ipmi_destroy_user(ipmi_user); + pm_power_off = old_poweroff_func; } static struct ipmi_smi_watcher smi_watcher = @@ -616,9 +702,9 @@ static int ipmi_poweroff_init (void) printk(KERN_ERR PFX "Unable to register SMI watcher: %d\n", rv); goto out_err; } -#endif out_err: +#endif return rv; } diff -Naurp linux-2.6.18.noarch/drivers/char/ipmi/ipmi_si_intf.c linux-ipmi/drivers/char/ipmi/ipmi_si_intf.c --- linux-2.6.18.noarch/drivers/char/ipmi/ipmi_si_intf.c 2007-06-11 15:26:45.000000000 -0400 +++ linux-ipmi/drivers/char/ipmi/ipmi_si_intf.c 2007-06-11 15:23:49.000000000 -0400 @@ -61,6 +61,10 @@ #include "ipmi_si_sm.h" #include <linux/init.h> #include <linux/dmi.h> +#include <linux/string.h> +#include <linux/ctype.h> + +#define PFX "ipmi_si: " /* Measure times between events in the driver. */ #undef DEBUG_TIMING @@ -92,7 +96,7 @@ enum si_intf_state { enum si_type { SI_KCS, SI_SMIC, SI_BT }; -static char *si_to_str[] = { "KCS", "SMIC", "BT" }; +static char *si_to_str[] = { "kcs", "smic", "bt" }; #define DEVICE_NAME "ipmi_si" @@ -217,7 +221,15 @@ struct smi_info struct list_head link; }; +#define SI_MAX_PARMS 4 + +static int force_kipmid[SI_MAX_PARMS]; +static int num_force_kipmid; + +static int unload_when_empty = 1; + static int try_smi_init(struct smi_info *smi); +static void cleanup_one_si(struct smi_info *to_clean); static ATOMIC_NOTIFIER_HEAD(xaction_notifier_list); static int register_xaction_notifier(struct notifier_block * nb) @@ -235,14 +247,18 @@ static void deliver_recv_msg(struct smi_ spin_lock(&(smi_info->si_lock)); } -static void return_hosed_msg(struct smi_info *smi_info) +static void return_hosed_msg(struct smi_info *smi_info, int cCode) { struct ipmi_smi_msg *msg = smi_info->curr_msg; + if (cCode < 0 || cCode > IPMI_ERR_UNSPECIFIED) + cCode = IPMI_ERR_UNSPECIFIED; + /* else use it as is */ + /* Make it a reponse */ msg->rsp[0] = msg->data[0] | 4; msg->rsp[1] = msg->data[1]; - msg->rsp[2] = 0xFF; /* Unknown error. */ + msg->rsp[2] = cCode; msg->rsp_size = 3; smi_info->curr_msg = NULL; @@ -293,7 +309,7 @@ static enum si_sm_result start_next_msg( smi_info->curr_msg->data, smi_info->curr_msg->data_size); if (err) { - return_hosed_msg(smi_info); + return_hosed_msg(smi_info, err); } rv = SI_SM_CALL_WITHOUT_DELAY; @@ -635,7 +651,7 @@ static enum si_sm_result smi_event_handl /* If we were handling a user message, format a response to send to the upper layer to tell it about the error. */ - return_hosed_msg(smi_info); + return_hosed_msg(smi_info, IPMI_ERR_UNSPECIFIED); } si_sm_result = smi_info->handlers->event(smi_info->si_sm, 0); } @@ -679,22 +695,24 @@ static enum si_sm_result smi_event_handl { /* We are idle and the upper layer requested that I fetch events, so do so. */ - unsigned char msg[2]; + atomic_set(&smi_info->req_events, 0); - spin_lock(&smi_info->count_lock); - smi_info->flag_fetches++; - spin_unlock(&smi_info->count_lock); + smi_info->curr_msg = ipmi_alloc_smi_msg(); + if (!smi_info->curr_msg) + goto out; - atomic_set(&smi_info->req_events, 0); - msg[0] = (IPMI_NETFN_APP_REQUEST << 2); - msg[1] = IPMI_GET_MSG_FLAGS_CMD; + smi_info->curr_msg->data[0] = (IPMI_NETFN_APP_REQUEST << 2); + smi_info->curr_msg->data[1] = IPMI_READ_EVENT_MSG_BUFFER_CMD; + smi_info->curr_msg->data_size = 2; smi_info->handlers->start_transaction( - smi_info->si_sm, msg, 2); - smi_info->si_state = SI_GETTING_FLAGS; + smi_info->si_sm, + smi_info->curr_msg->data, + smi_info->curr_msg->data_size); + smi_info->si_state = SI_GETTING_EVENTS; goto restart; } - + out: return si_sm_result; } @@ -709,6 +727,15 @@ static void sender(void * struct timeval t; #endif + if (atomic_read(&smi_info->stop_operation)) { + msg->rsp[0] = msg->data[0] | 4; + msg->rsp[1] = msg->data[1]; + msg->rsp[2] = IPMI_ERR_UNSPECIFIED; + msg->rsp_size = 3; + deliver_recv_msg(smi_info, msg); + return; + } + spin_lock_irqsave(&(smi_info->msg_lock), flags); #ifdef DEBUG_TIMING do_gettimeofday(&t); @@ -800,17 +827,25 @@ static void poll(void *send_info) { struct smi_info *smi_info = send_info; - smi_event_handler(smi_info, 0); + /* + * Make sure there is some delay in the poll loop so we can + * drive time forward and timeout things. + */ + udelay(10); + smi_event_handler(smi_info, 10); } static void request_events(void *send_info) { struct smi_info *smi_info = send_info; + if (atomic_read(&smi_info->stop_operation)) + return; + atomic_set(&smi_info->req_events, 1); } -static int initialized = 0; +static int initialized; static void smi_timeout(unsigned long data) { @@ -908,6 +943,7 @@ static int smi_start_processing(void ipmi_smi_t intf) { struct smi_info *new_smi = send_info; + int enable = 0; new_smi->intf = intf; @@ -916,7 +952,19 @@ static int smi_start_processing(void new_smi->last_timeout_jiffies = jiffies; mod_timer(&new_smi->si_timer, jiffies + SI_TIMEOUT_JIFFIES); - if (new_smi->si_type != SI_BT) { + /* + * Check if the user forcefully enabled the daemon. + */ + if (new_smi->intf_num < num_force_kipmid) + enable = force_kipmid[new_smi->intf_num]; + /* + * The BT interface is efficient enough to not need a thread, + * and there is no need for a thread if we have interrupts. + */ + else if ((new_smi->si_type != SI_BT) && (!new_smi->irq)) + enable = 1; + + if (enable) { new_smi->thread = kthread_run(ipmi_thread, new_smi, "kipmi%d", new_smi->intf_num); if (IS_ERR(new_smi->thread)) { @@ -931,12 +979,21 @@ static int smi_start_processing(void return 0; } +static void set_maintenance_mode(void *send_info, int enable) +{ + struct smi_info *smi_info = send_info; + + if (!enable) + atomic_set(&smi_info->req_events, 0); +} + static struct ipmi_smi_handlers handlers = { .owner = THIS_MODULE, .start_processing = smi_start_processing, .sender = sender, .request_events = request_events, + .set_maintenance_mode = set_maintenance_mode, .set_run_to_completion = set_run_to_completion, .poll = poll, }; @@ -944,7 +1001,6 @@ static struct ipmi_smi_handlers handlers /* There can be 4 IO ports passed in (with or without IRQs), 4 addresses, a default IO port, and 1 ACPI/SPMI address. That sets SI_MAX_DRIVERS */ -#define SI_MAX_PARMS 4 static LIST_HEAD(smi_infos); static DEFINE_MUTEX(smi_infos_lock); static int smi_num; /* Used to sequence the SMIs */ @@ -962,14 +1018,24 @@ static int num_ports; static int irqs[SI_MAX_PARMS]; static int num_irqs; static int regspacings[SI_MAX_PARMS]; -static int num_regspacings = 0; +static int num_regspacings; static int regsizes[SI_MAX_PARMS]; -static int num_regsizes = 0; +static int num_regsizes; static int regshifts[SI_MAX_PARMS]; -static int num_regshifts = 0; +static int num_regshifts; static int slave_addrs[SI_MAX_PARMS]; -static int num_slave_addrs = 0; +static int num_slave_addrs; +#define IPMI_IO_ADDR_SPACE 0 +#define IPMI_MEM_ADDR_SPACE 1 +static char *addr_space_to_str[] = { "i/o", "mem" }; + +static int hotmod_handler(const char *val, struct kernel_param *kp); + +module_param_call(hotmod, hotmod_handler, NULL, NULL, 0200); +MODULE_PARM_DESC(hotmod, "Add and remove interfaces. See" + " Documentation/IPMI.txt in the kernel sources for the" + " gory details."); module_param_named(trydefaults, si_trydefaults, bool, 0); MODULE_PARM_DESC(trydefaults, "Setting this to 'false' will disable the" @@ -1017,12 +1083,16 @@ MODULE_PARM_DESC(slave_addrs, "Set the d " the controller. Normally this is 0x20, but can be" " overridden by this parm. This is an array indexed" " by interface number."); +module_param_array(force_kipmid, int, &num_force_kipmid, 0); +MODULE_PARM_DESC(force_kipmid, "Force the kipmi daemon to be enabled (1) or" + " disabled(0). Normally the IPMI driver auto-detects" + " this, but the value may be overridden by this parm."); +module_param(unload_when_empty, int, 0); +MODULE_PARM_DESC(unload_when_empty, "Unload the module if no interfaces are" + " specified or found, default is 1. Setting to 0" + " is useful for hot add of devices using hotmod."); -#define IPMI_IO_ADDR_SPACE 0 -#define IPMI_MEM_ADDR_SPACE 1 -static char *addr_space_to_str[] = { "I/O", "memory" }; - static void std_irq_cleanup(struct smi_info *info) { if (info->si_type == SI_BT) @@ -1190,7 +1260,7 @@ static void intf_mem_outb(struct si_sm_i static unsigned char intf_mem_inw(struct si_sm_io *io, unsigned int offset) { return (readw((io->addr)+(offset * io->regspacing)) >> io->regshift) - && 0xff; + & 0xff; } static void intf_mem_outw(struct si_sm_io *io, unsigned int offset, @@ -1202,7 +1272,7 @@ static void intf_mem_outw(struct si_sm_i static unsigned char intf_mem_inl(struct si_sm_io *io, unsigned int offset) { return (readl((io->addr)+(offset * io->regspacing)) >> io->regshift) - && 0xff; + & 0xff; } static void intf_mem_outl(struct si_sm_io *io, unsigned int offset, @@ -1215,7 +1285,7 @@ static void intf_mem_outl(struct si_sm_i static unsigned char mem_inq(struct si_sm_io *io, unsigned int offset) { return (readq((io->addr)+(offset * io->regspacing)) >> io->regshift) - && 0xff; + & 0xff; } static void mem_outq(struct si_sm_io *io, unsigned int offset, @@ -1296,6 +1366,250 @@ static int mem_setup(struct smi_info *in return 0; } +/* + * Parms come in as <op1>[:op2[:op3...]]. ops are: + * add|remove,kcs|bt|smic,mem|i/o,<address>[,<opt1>[,<opt2>[,...]]] + * Options are: + * rsp=<regspacing> + * rsi=<regsize> + * rsh=<regshift> + * irq=<irq> + * ipmb=<ipmb addr> + */ +enum hotmod_op { HM_ADD, HM_REMOVE }; +struct hotmod_vals { + char *name; + int val; +}; +static struct hotmod_vals hotmod_ops[] = { + { "add", HM_ADD }, + { "remove", HM_REMOVE }, + { NULL } +}; +static struct hotmod_vals hotmod_si[] = { + { "kcs", SI_KCS }, + { "smic", SI_SMIC }, + { "bt", SI_BT }, + { NULL } +}; +static struct hotmod_vals hotmod_as[] = { + { "mem", IPMI_MEM_ADDR_SPACE }, + { "i/o", IPMI_IO_ADDR_SPACE }, + { NULL } +}; + +static int parse_str(struct hotmod_vals *v, int *val, char *name, char **curr) +{ + char *s; + int i; + + s = strchr(*curr, ','); + if (!s) { + printk(KERN_WARNING PFX "No hotmod %s given.\n", name); + return -EINVAL; + } + *s = '\0'; + s++; + for (i = 0; hotmod_ops[i].name; i++) { + if (strcmp(*curr, v[i].name) == 0) { + *val = v[i].val; + *curr = s; + return 0; + } + } + + printk(KERN_WARNING PFX "Invalid hotmod %s '%s'\n", name, *curr); + return -EINVAL; +} + +static int check_hotmod_int_op(const char *curr, const char *option, + const char *name, int *val) +{ + char *n; + + if (strcmp(curr, name) == 0) { + if (!option) { + printk(KERN_WARNING PFX + "No option given for '%s'\n", + curr); + return -EINVAL; + } + *val = simple_strtoul(option, &n, 0); + if ((*n != '\0') || (*option == '\0')) { + printk(KERN_WARNING PFX + "Bad option given for '%s'\n", + curr); + return -EINVAL; + } + return 1; + } + return 0; +} + +static int hotmod_handler(const char *val, struct kernel_param *kp) +{ + char *str = kstrdup(val, GFP_KERNEL); + int rv; + char *next, *curr, *s, *n, *o; + enum hotmod_op op; + enum si_type si_type; + int addr_space; + unsigned long addr; + int regspacing; + int regsize; + int regshift; + int irq; + int ipmb; + int ival; + int len; + struct smi_info *info; + + if (!str) + return -ENOMEM; + + /* Kill any trailing spaces, as we can get a "\n" from echo. */ + len = strlen(str); + ival = len - 1; + while ((ival >= 0) && isspace(str[ival])) { + str[ival] = '\0'; + ival--; + } + + for (curr = str; curr; curr = next) { + regspacing = 1; + regsize = 1; + regshift = 0; + irq = 0; + ipmb = 0x20; + + next = strchr(curr, ':'); + if (next) { + *next = '\0'; + next++; + } + + rv = parse_str(hotmod_ops, &ival, "operation", &curr); + if (rv) + break; + op = ival; + + rv = parse_str(hotmod_si, &ival, "interface type", &curr); + if (rv) + break; + si_type = ival; + + rv = parse_str(hotmod_as, &addr_space, "address space", &curr); + if (rv) + break; + + s = strchr(curr, ','); + if (s) { + *s = '\0'; + s++; + } + addr = simple_strtoul(curr, &n, 0); + if ((*n != '\0') || (*curr == '\0')) { + printk(KERN_WARNING PFX "Invalid hotmod address" + " '%s'\n", curr); + break; + } + + while (s) { + curr = s; + s = strchr(curr, ','); + if (s) { + *s = '\0'; + s++; + } + o = strchr(curr, '='); + if (o) { + *o = '\0'; + o++; + } + rv = check_hotmod_int_op(curr, o, "rsp", ®spacing); + if (rv < 0) + goto out; + else if (rv) + continue; + rv = check_hotmod_int_op(curr, o, "rsi", ®size); + if (rv < 0) + goto out; + else if (rv) + continue; + rv = check_hotmod_int_op(curr, o, "rsh", ®shift); + if (rv < 0) + goto out; + else if (rv) + continue; + rv = check_hotmod_int_op(curr, o, "irq", &irq); + if (rv < 0) + goto out; + else if (rv) + continue; + rv = check_hotmod_int_op(curr, o, "ipmb", &ipmb); + if (rv < 0) + goto out; + else if (rv) + continue; + + rv = -EINVAL; + printk(KERN_WARNING PFX + "Invalid hotmod option '%s'\n", + curr); + goto out; + } + + if (op == HM_ADD) { + info = kzalloc(sizeof(*info), GFP_KERNEL); + if (!info) { + rv = -ENOMEM; + goto out; + } + + info->addr_source = "hotmod"; + info->si_type = si_type; + info->io.addr_data = addr; + info->io.addr_type = addr_space; + if (addr_space == IPMI_MEM_ADDR_SPACE) + info->io_setup = mem_setup; + else + info->io_setup = port_setup; + + info->io.addr = NULL; + info->io.regspacing = regspacing; + if (!info->io.regspacing) + info->io.regspacing = DEFAULT_REGSPACING; + info->io.regsize = regsize; + if (!info->io.regsize) + info->io.regsize = DEFAULT_REGSPACING; + info->io.regshift = regshift; + info->irq = irq; + if (info->irq) + info->irq_setup = std_irq_setup; + info->slave_addr = ipmb; + + try_smi_init(info); + } else { + /* remove */ + struct smi_info *e, *tmp_e; + + mutex_lock(&smi_infos_lock); + list_for_each_entry_safe(e, tmp_e, &smi_infos, link) { + if (e->io.addr_type != addr_space) + continue; + if (e->si_type != si_type) + continue; + if (e->io.addr_data == addr) + cleanup_one_si(e); + } + mutex_unlock(&smi_infos_lock); + } + } + rv = len; + out: + kfree(str); + return rv; +} static __devinit void hardcode_find_bmc(void) { @@ -1370,7 +1684,7 @@ static __devinit void hardcode_find_bmc( /* Once we get an ACPI failure, we don't try any more, because we go through the tables sequentially. Once we don't find a table, there are no more. */ -static int acpi_failure = 0; +static int acpi_failure; /* For GPE-type interrupts. */ static u32 ipmi_acpi_gpe(void *context) @@ -1481,7 +1795,6 @@ struct SPMITable { static __devinit int try_init_acpi(struct SPMITable *spmi) { struct smi_info *info; - char *io_type; u8 addr_space; if (spmi->IPMIlegacy != 1) { @@ -1545,13 +1858,11 @@ static __devinit int try_init_acpi(struc info->io.regshift = spmi->addr.register_bit_offset; if (spmi->addr.address_space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) { - io_type = "memory"; info->io_setup = mem_setup; - info->io.addr_type = IPMI_IO_ADDR_SPACE; + info->io.addr_type = IPMI_MEM_ADDR_SPACE; } else if (spmi->addr.address_space_id == ACPI_ADR_SPACE_SYSTEM_IO) { - io_type = "I/O"; info->io_setup = port_setup; - info->io.addr_type = IPMI_MEM_ADDR_SPACE; + info->io.addr_type = IPMI_IO_ADDR_SPACE; } else { kfree(info); printk("ipmi_si: Unknown ACPI I/O Address type\n"); @@ -1730,6 +2041,7 @@ static void __devinit dmi_find_bmc(void) int rv; while ((dev = dmi_find_device(DMI_DEV_TYPE_IPMI, NULL, dev))) { + memset(&data, 0, sizeof(data)); rv = decode_dmi((struct dmi_header *) dev->device_data, &data); if (!rv) try_init_dmi(&data); @@ -1767,7 +2079,7 @@ static int __devinit ipmi_pci_probe(stru info = kzalloc(sizeof(*info), GFP_KERNEL); if (!info) - return ENOMEM; + return -ENOMEM; info->addr_source = "PCI"; @@ -1788,7 +2100,7 @@ static int __devinit ipmi_pci_probe(stru kfree(info); printk(KERN_INFO "ipmi_si: %s: Unknown IPMI type: %d\n", pci_name(pdev), class_type); - return ENOMEM; + return -ENOMEM; } rv = pci_enable_device(pdev); @@ -1823,12 +2135,16 @@ static int __devinit ipmi_pci_probe(stru info->irq_setup = std_irq_setup; info->dev = &pdev->dev; + pci_set_drvdata(pdev, info); return try_smi_init(info); } static void __devexit ipmi_pci_remove(struct pci_dev *pdev) { + struct smi_info *info = pci_get_drvdata(pdev); + pci_set_drvdata(pdev, NULL); + cleanup_one_si(info); } #ifdef CONFIG_PM @@ -1930,19 +2246,9 @@ static int try_get_dev_id(struct smi_inf static int type_file_read_proc(char *page, char **start, off_t off, int count, int *eof, void *data) { - char *out = (char *) page; struct smi_info *smi = data; - switch (smi->si_type) { - case SI_KCS: - return sprintf(out, "kcs\n"); - case SI_SMIC: - return sprintf(out, "smic\n"); - case SI_BT: - return sprintf(out, "bt\n"); - default: - return 0; - } + return sprintf(page, "%s\n", si_to_str[smi->si_type]); } static int stat_file_read_proc(char *page, char **start, off_t off, @@ -1978,7 +2284,24 @@ static int stat_file_read_proc(char *pag out += sprintf(out, "incoming_messages: %ld\n", smi->incoming_messages); - return (out - ((char *) page)); + return out - page; +} + +static int param_read_proc(char *page, char **start, off_t off, + int count, int *eof, void *data) +{ + struct smi_info *smi = data; + + return sprintf(page, + "%s,%s,0x%lx,rsp=%d,rsi=%d,rsh=%d,irq=%d,ipmb=%d\n", + si_to_str[smi->si_type], + addr_space_to_str[smi->io.addr_type], + smi->io.addr_data, + smi->io.regspacing, + smi->io.regsize, + smi->io.regshift, + smi->irq, + smi->slave_addr); } /* @@ -2161,6 +2484,11 @@ static __devinit void default_find_bmc(v if (!info) return; +#ifdef CONFIG_PPC_MERGE + if (check_legacy_ioport(ipmi_defaults[i].port)) + continue; +#endif + info->addr_source = NULL; info->si_type = ipmi_defaults[i].type; @@ -2369,6 +2697,16 @@ static int try_smi_init(struct smi_info goto out_err_stop_timer; } + rv = ipmi_smi_add_proc_entry(new_smi->intf, "params", + param_read_proc, NULL, + new_smi, THIS_MODULE); + if (rv) { + printk(KERN_ERR + "ipmi_si: Unable to create proc entry: %d\n", + rv); + goto out_err_stop_timer; + } + list_add_tail(&new_smi->link, &smi_infos); mutex_unlock(&smi_infos_lock); @@ -2457,12 +2795,16 @@ static __devinit int init_ipmi_si(void) #endif #ifdef CONFIG_ACPI - if (si_trydefaults) - acpi_find_bmc(); + acpi_find_bmc(); #endif #ifdef CONFIG_PCI - pci_module_init(&ipmi_pci_driver); + rv = pci_register_driver(&ipmi_pci_driver); + if (rv){ + printk(KERN_ERR + "init_ipmi_si: Unable to register PCI driver: %d\n", + rv); + } #endif if (si_trydefaults) { @@ -2477,7 +2819,7 @@ static __devinit int init_ipmi_si(void) } mutex_lock(&smi_infos_lock); - if (list_empty(&smi_infos)) { + if (unload_when_empty && list_empty(&smi_infos)) { mutex_unlock(&smi_infos_lock); #ifdef CONFIG_PCI pci_unregister_driver(&ipmi_pci_driver); @@ -2492,7 +2834,7 @@ static __devinit int init_ipmi_si(void) } module_init(init_ipmi_si); -static void __devexit cleanup_one_si(struct smi_info *to_clean) +static void cleanup_one_si(struct smi_info *to_clean) { int rv; unsigned long flags; diff -Naurp linux-2.6.18.noarch/drivers/char/ipmi/ipmi_smic_sm.c linux-ipmi/drivers/char/ipmi/ipmi_smic_sm.c --- linux-2.6.18.noarch/drivers/char/ipmi/ipmi_smic_sm.c 2006-09-19 23:42:06.000000000 -0400 +++ linux-ipmi/drivers/char/ipmi/ipmi_smic_sm.c 2007-06-11 15:23:49.000000000 -0400 @@ -141,12 +141,14 @@ static int start_smic_transaction(struct { unsigned int i; - if ((size < 2) || (size > MAX_SMIC_WRITE_SIZE)) { - return -1; - } - if ((smic->state != SMIC_IDLE) && (smic->state != SMIC_HOSED)) { - return -2; - } + if (size < 2) + return IPMI_REQ_LEN_INVALID_ERR; + if (size > MAX_SMIC_WRITE_SIZE) + return IPMI_REQ_LEN_EXCEEDED_ERR; + + if ((smic->state != SMIC_IDLE) && (smic->state != SMIC_HOSED)) + return IPMI_NOT_IN_MY_STATE_ERR; + if (smic_debug & SMIC_DEBUG_MSG) { printk(KERN_INFO "start_smic_transaction -"); for (i = 0; i < size; i ++) { diff -Naurp linux-2.6.18.noarch/drivers/char/ipmi/ipmi_watchdog.c linux-ipmi/drivers/char/ipmi/ipmi_watchdog.c --- linux-2.6.18.noarch/drivers/char/ipmi/ipmi_watchdog.c 2006-09-19 23:42:06.000000000 -0400 +++ linux-ipmi/drivers/char/ipmi/ipmi_watchdog.c 2007-06-11 15:23:49.000000000 -0400 @@ -134,13 +134,14 @@ static int nowayout = WATCHDOG_NOWAYOUT; -static ipmi_user_t watchdog_user = NULL; +static ipmi_user_t watchdog_user; +static int watchdog_ifnum; /* Default the timeout to 10 seconds. */ static int timeout = 10; /* The pre-timeout is disabled by default. */ -static int pretimeout = 0; +static int pretimeout; /* Default action is to reset the board on a timeout. */ static unsigned char action_val = WDOG_TIMEOUT_RESET; @@ -155,12 +156,14 @@ static unsigned char preop_val = WDOG_PR static char preop[16] = "preop_none"; static DEFINE_SPINLOCK(ipmi_read_lock); -static char data_to_read = 0; +static char data_to_read; static DECLARE_WAIT_QUEUE_HEAD(read_q); -static struct fasync_struct *fasync_q = NULL; -static char pretimeout_since_last_heartbeat = 0; +static struct fasync_struct *fasync_q; +static char pretimeout_since_last_heartbeat; static char expect_close; +static int ifnum_to_use = -1; + static DECLARE_RWSEM(register_sem); /* Parameters to ipmi_set_timeout */ @@ -169,10 +172,12 @@ static DECLARE_RWSEM(register_sem); #define IPMI_SET_TIMEOUT_FORCE_HB 2 static int ipmi_set_timeout(int do_heartbeat); +static void ipmi_register_watchdog(int ipmi_intf); +static void ipmi_unregister_watchdog(int ipmi_intf); /* If true, the driver will start running as soon as it is configured and ready. */ -static int start_now = 0; +static int start_now; static int set_param_int(const char *val, struct kernel_param *kp) { @@ -211,13 +216,13 @@ static int set_param_str(const char *val { action_fn fn = (action_fn) kp->arg; int rv = 0; - char *dup, *s; + char valcp[16]; + char *s; - dup = kstrdup(val, GFP_KERNEL); - if (!dup) - return -ENOMEM; + strncpy(valcp, val, 16); + valcp[15] = '\0'; - s = strstrip(dup); + s = strstrip(valcp); down_read(®ister_sem); rv = fn(s, NULL); @@ -230,7 +235,6 @@ static int set_param_str(const char *val out_unlock: up_read(®ister_sem); - kfree(dup); return rv; } @@ -245,6 +249,26 @@ static int get_param_str(char *buffer, s return strlen(buffer); } + +static int set_param_wdog_ifnum(const char *val, struct kernel_param *kp) +{ + int rv = param_set_int(val, kp); + if (rv) + return rv; + if ((ifnum_to_use < 0) || (ifnum_to_use == watchdog_ifnum)) + return 0; + + ipmi_unregister_watchdog(watchdog_ifnum); + ipmi_register_watchdog(ifnum_to_use); + return 0; +} + +module_param_call(ifnum_to_use, set_param_wdog_ifnum, get_param_int, + &ifnum_to_use, 0644); +MODULE_PARM_DESC(ifnum_to_use, "The interface number to use for the watchdog " + "timer. Setting to -1 defaults to the first registered " + "interface"); + module_param_call(timeout, set_param_int, get_param_int, &timeout, 0644); MODULE_PARM_DESC(timeout, "Timeout value in seconds."); @@ -263,27 +287,28 @@ module_param_call(preop, set_param_str, MODULE_PARM_DESC(preop, "Pretimeout driver operation. One of: " "preop_none, preop_panic, preop_give_data."); -module_param(start_now, int, 0); +module_param(start_now, int, 0444); MODULE_PARM_DESC(start_now, "Set to 1 to start the watchdog as" "soon as the driver is loaded."); module_param(nowayout, int, 0644); -MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default=CONFIG_WATCHDOG_NOWAYOUT)"); +MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started " + "(default=CONFIG_WATCHDOG_NOWAYOUT)"); /* Default state of the timer. */ static unsigned char ipmi_watchdog_state = WDOG_TIMEOUT_NONE; /* If shutting down via IPMI, we ignore the heartbeat. */ -static int ipmi_ignore_heartbeat = 0; +static int ipmi_ignore_heartbeat; /* Is someone using the watchdog? Only one user is allowed. */ -static unsigned long ipmi_wdog_open = 0; +static unsigned long ipmi_wdog_open; /* If set to 1, the heartbeat command will set the state to reset and start the timer. The timer doesn't normally run when the driver is first opened until the heartbeat is set the first time, this variable is used to accomplish this. */ -static int ipmi_start_timer_on_heartbeat = 0; +static int ipmi_start_timer_on_heartbeat; /* IPMI version of the BMC. */ static unsigned char ipmi_version_major; @@ -872,6 +897,11 @@ static void ipmi_register_watchdog(int i if (watchdog_user) goto out; + if ((ifnum_to_use >= 0) && (ifnum_to_use != ipmi_intf)) + goto out; + + watchdog_ifnum = ipmi_intf; + rv = ipmi_create_user(ipmi_intf, &ipmi_hndlrs, NULL, &watchdog_user); if (rv < 0) { printk(KERN_CRIT PFX "Unable to register with ipmi\n"); @@ -901,6 +931,39 @@ static void ipmi_register_watchdog(int i } } +static void ipmi_unregister_watchdog(int ipmi_intf) +{ + int rv; + + down_write(®ister_sem); + + if (!watchdog_user) + goto out; + + if (watchdog_ifnum != ipmi_intf) + goto out; + + /* Make sure no one can call us any more. */ + misc_deregister(&ipmi_wdog_miscdev); + + /* Wait to make sure the message makes it out. The lower layer has + pointers to our buffers, we want to make sure they are done before + we release our memory. */ + while (atomic_read(&set_timeout_tofree)) + schedule_timeout_uninterruptible(1); + + /* Disconnect from IPMI. */ + rv = ipmi_destroy_user(watchdog_user); + if (rv) { + printk(KERN_WARNING PFX "error unlinking from IPMI: %d\n", + rv); + } + watchdog_user = NULL; + + out: + up_write(®ister_sem); +} + #ifdef HAVE_NMI_HANDLER static int ipmi_nmi(void *dev_id, struct pt_regs *regs, int cpu, int handled) @@ -1004,9 +1067,7 @@ static void ipmi_new_smi(int if_num, str static void ipmi_smi_gone(int if_num) { - /* This can never be called, because once the watchdog is - registered, the interface can't go away until the watchdog - is unregistered. */ + ipmi_unregister_watchdog(if_num); } static struct ipmi_smi_watcher smi_watcher = @@ -1148,30 +1209,32 @@ static int __init ipmi_wdog_init(void) check_parms(); + register_reboot_notifier(&wdog_reboot_notifier); + atomic_notifier_chain_register(&panic_notifier_list, + &wdog_panic_notifier); + rv = ipmi_smi_watcher_register(&smi_watcher); if (rv) { #ifdef HAVE_NMI_HANDLER if (preaction_val == WDOG_PRETIMEOUT_NMI) release_nmi(&ipmi_nmi_handler); #endif + atomic_notifier_chain_unregister(&panic_notifier_list, + &wdog_panic_notifier); + unregister_reboot_notifier(&wdog_reboot_notifier); printk(KERN_WARNING PFX "can't register smi watcher\n"); return rv; } - register_reboot_notifier(&wdog_reboot_notifier); - atomic_notifier_chain_register(&panic_notifier_list, - &wdog_panic_notifier); - printk(KERN_INFO PFX "driver initialized\n"); return 0; } -static __exit void ipmi_unregister_watchdog(void) +static void __exit ipmi_wdog_exit(void) { - int rv; - - down_write(®ister_sem); + ipmi_smi_watcher_unregister(&smi_watcher); + ipmi_unregister_watchdog(watchdog_ifnum); #ifdef HAVE_NMI_HANDLER if (nmi_handler_registered) @@ -1179,37 +1242,8 @@ static __exit void ipmi_unregister_watch #endif atomic_notifier_chain_unregister(&panic_notifier_list, - &wdog_panic_notifier); + &wdog_panic_notifier); unregister_reboot_notifier(&wdog_reboot_notifier); - - if (! watchdog_user) - goto out; - - /* Make sure no one can call us any more. */ - misc_deregister(&ipmi_wdog_miscdev); - - /* Wait to make sure the message makes it out. The lower layer has - pointers to our buffers, we want to make sure they are done before - we release our memory. */ - while (atomic_read(&set_timeout_tofree)) - schedule_timeout_uninterruptible(1); - - /* Disconnect from IPMI. */ - rv = ipmi_destroy_user(watchdog_user); - if (rv) { - printk(KERN_WARNING PFX "error unlinking from IPMI: %d\n", - rv); - } - watchdog_user = NULL; - - out: - up_write(®ister_sem); -} - -static void __exit ipmi_wdog_exit(void) -{ - ipmi_smi_watcher_unregister(&smi_watcher); - ipmi_unregister_watchdog(); } module_exit(ipmi_wdog_exit); module_init(ipmi_wdog_init); diff -Naurp linux-2.6.18.noarch/include/linux/ipmi.h linux-ipmi/include/linux/ipmi.h --- linux-2.6.18.noarch/include/linux/ipmi.h 2006-09-19 23:42:06.000000000 -0400 +++ linux-ipmi/include/linux/ipmi.h 2007-06-11 15:23:49.000000000 -0400 @@ -148,6 +148,13 @@ struct ipmi_lan_addr #define IPMI_BMC_CHANNEL 0xf #define IPMI_NUM_CHANNELS 0x10 +/* + * Used to signify an "all channel" bitmask. This is more than the + * actual number of channels because this is used in userland and + * will cover us if the number of channels is extended. + */ +#define IPMI_CHAN_ALL (~0) + /* * A raw IPMI message without any addressing. This covers both @@ -201,6 +208,15 @@ struct kernel_ipmi_msg code as the first byte of the incoming data, unlike a response. */ +/* + * Modes for ipmi_set_maint_mode() and the userland IOCTL. The AUTO + * setting is the default and means it will be set on certain + * commands. Hard setting it on and off will override automatic + * operation. + */ +#define IPMI_MAINTENANCE_MODE_AUTO 0 +#define IPMI_MAINTENANCE_MODE_OFF 1 +#define IPMI_MAINTENANCE_MODE_ON 2 #ifdef __KERNEL__ @@ -350,18 +366,50 @@ int ipmi_request_supply_msgs(ipmi_user_t /* * When commands come in to the SMS, the user can register to receive - * them. Only one user can be listening on a specific netfn/cmd pair + * them. Only one user can be listening on a specific netfn/cmd/chan tuple * at a time, you will get an EBUSY error if the command is already * registered. If a command is received that does not have a user * registered, the driver will automatically return the proper - * error. + * error. Channels are specified as a bitfield, use IPMI_CHAN_ALL to + * mean all channels. */ int ipmi_register_for_cmd(ipmi_user_t user, unsigned char netfn, - unsigned char cmd); + unsigned char cmd, + unsigned int chans); int ipmi_unregister_for_cmd(ipmi_user_t user, unsigned char netfn, - unsigned char cmd); + unsigned char cmd, + unsigned int chans); + +/* + * Go into a mode where the driver will not autonomously attempt to do + * things with the interface. It will still respond to attentions and + * interrupts, and it will expect that commands will complete. It + * will not automatcially check for flags, events, or things of that + * nature. + * + * This is primarily used for firmware upgrades. The idea is that + * when you go into firmware upgrade mode, you do this operation + * and the driver will not attempt to do anything but what you tell + * it or what the BMC asks for. + * + * Note that if you send a command that resets the BMC, the driver + * will still expect a response from that command. So the BMC should + * reset itself *after* the response is sent. Resetting before the + * response is just silly. + * + * If in auto maintenance mode, the driver will automatically go into + * maintenance mode for 30 seconds if it sees a cold reset, a warm + * reset, or a firmware NetFN. This means that code that uses only + * firmware NetFN commands to do upgrades will work automatically + * without change, assuming it sends a message every 30 seconds or + * less. + * + * See the IPMI_MAINTENANCE_MODE_xxx defines for what the mode means. + */ +int ipmi_get_maintenance_mode(ipmi_user_t user); +int ipmi_set_maintenance_mode(ipmi_user_t user, int mode); /* * Allow run-to-completion mode to be set for the interface of @@ -571,6 +619,36 @@ struct ipmi_cmdspec #define IPMICTL_UNREGISTER_FOR_CMD _IOR(IPMI_IOC_MAGIC, 15, \ struct ipmi_cmdspec) +/* + * Register to get commands from other entities on specific channels. + * This way, you can only listen on specific channels, or have messages + * from some channels go to one place and other channels to someplace + * else. The chans field is a bitmask, (1 << channel) for each channel. + * It may be IPMI_CHAN_ALL for all channels. + */ +struct ipmi_cmdspec_chans +{ + unsigned int netfn; + unsigned int cmd; + unsigned int chans; +}; + +/* + * Register to receive a specific command on specific channels. error values: + * - EFAULT - an address supplied was invalid. + * - EBUSY - One of the netfn/cmd/chans supplied was already in use. + * - ENOMEM - could not allocate memory for the entry. + */ +#define IPMICTL_REGISTER_FOR_CMD_CHANS _IOR(IPMI_IOC_MAGIC, 28, \ + struct ipmi_cmdspec_chans) +/* + * Unregister some netfn/cmd/chans. error values: + * - EFAULT - an address supplied was invalid. + * - ENOENT - None of the netfn/cmd/chans were found registered for this user. + */ +#define IPMICTL_UNREGISTER_FOR_CMD_CHANS _IOR(IPMI_IOC_MAGIC, 29, \ + struct ipmi_cmdspec_chans) + /* * Set whether this interface receives events. Note that the first * user registered for events will get all pending events for the @@ -616,4 +694,11 @@ struct ipmi_timing_parms #define IPMICTL_GET_TIMING_PARMS_CMD _IOR(IPMI_IOC_MAGIC, 23, \ struct ipmi_timing_parms) +/* + * Set the maintenance mode. See ipmi_set_maintenance_mode() above + * for a description of what this does. + */ +#define IPMICTL_GET_MAINTENANCE_MODE_CMD _IOR(IPMI_IOC_MAGIC, 30, int) +#define IPMICTL_SET_MAINTENANCE_MODE_CMD _IOW(IPMI_IOC_MAGIC, 31, int) + #endif /* __LINUX_IPMI_H */ diff -Naurp linux-2.6.18.noarch/include/linux/ipmi_msgdefs.h linux-ipmi/include/linux/ipmi_msgdefs.h --- linux-2.6.18.noarch/include/linux/ipmi_msgdefs.h 2006-09-19 23:42:06.000000000 -0400 +++ linux-ipmi/include/linux/ipmi_msgdefs.h 2007-06-11 15:23:49.000000000 -0400 @@ -46,6 +46,8 @@ #define IPMI_NETFN_APP_REQUEST 0x06 #define IPMI_NETFN_APP_RESPONSE 0x07 #define IPMI_GET_DEVICE_ID_CMD 0x01 +#define IPMI_COLD_RESET_CMD 0x02 +#define IPMI_WARM_RESET_CMD 0x03 #define IPMI_CLEAR_MSG_FLAGS_CMD 0x30 #define IPMI_GET_DEVICE_GUID_CMD 0x08 #define IPMI_GET_MSG_FLAGS_CMD 0x31 @@ -60,21 +62,30 @@ #define IPMI_NETFN_STORAGE_RESPONSE 0x0b #define IPMI_ADD_SEL_ENTRY_CMD 0x44 +#define IPMI_NETFN_FIRMWARE_REQUEST 0x08 +#define IPMI_NETFN_FIRMWARE_RESPONSE 0x09 + /* The default slave address */ #define IPMI_BMC_SLAVE_ADDR 0x20 /* The BT interface on high-end HP systems supports up to 255 bytes in * one transfer. Its "virtual" BMC supports some commands that are longer * than 128 bytes. Use the full 256, plus NetFn/LUN, Cmd, cCode, plus - * some overhead. It would be nice to base this on the "BT Capabilities" - * but that's too hard to propagate to the rest of the driver. */ + * some overhead; it's not worth the effort to dynamically size this based + * on the results of the "Get BT Capabilities" command. */ #define IPMI_MAX_MSG_LENGTH 272 /* multiple of 16 */ #define IPMI_CC_NO_ERROR 0x00 #define IPMI_NODE_BUSY_ERR 0xc0 #define IPMI_INVALID_COMMAND_ERR 0xc1 +#define IPMI_TIMEOUT_ERR 0xc3 #define IPMI_ERR_MSG_TRUNCATED 0xc6 +#define IPMI_REQ_LEN_INVALID_ERR 0xc7 +#define IPMI_REQ_LEN_EXCEEDED_ERR 0xc8 +#define IPMI_NOT_IN_MY_STATE_ERR 0xd5 /* IPMI 2.0 */ #define IPMI_LOST_ARBITRATION_ERR 0x81 +#define IPMI_BUS_ERR 0x82 +#define IPMI_NAK_ON_WRITE_ERR 0x83 #define IPMI_ERR_UNSPECIFIED 0xff #define IPMI_CHANNEL_PROTOCOL_IPMB 1 diff -Naurp linux-2.6.18.noarch/include/linux/ipmi_smi.h linux-ipmi/include/linux/ipmi_smi.h --- linux-2.6.18.noarch/include/linux/ipmi_smi.h 2007-06-11 15:26:45.000000000 -0400 +++ linux-ipmi/include/linux/ipmi_smi.h 2007-06-11 15:23:49.000000000 -0400 @@ -115,6 +115,13 @@ struct ipmi_smi_handlers poll for operations during things like crash dumps. */ void (*poll)(void *send_info); + /* Enable/disable firmware maintenance mode. Note that this + is *not* the modes defined, this is simply an on/off + setting. The message handler does the mode handling. Note + that this is called from interupt context, so it cannot + block. */ + void (*set_maintenance_mode)(void *send_info, int enable); + /* Tell the handler that we are using it/not using it. The message handler get the modules that this handler belongs to; this function lets the SMI claim any modules that it diff -Naurp linux-2.6.18.noarch/include/linux/list.h linux-ipmi/include/linux/list.h --- linux-2.6.18.noarch/include/linux/list.h 2007-06-11 15:26:37.000000000 -0400 +++ linux-ipmi/include/linux/list.h 2007-06-11 15:23:49.000000000 -0400 @@ -360,6 +360,62 @@ static inline void list_splice_init(stru } /** + * list_splice_init_rcu - splice an RCU-protected list into an existing list. + * @list: the RCU-protected list to splice + * @head: the place in the list to splice the first list into + * @sync: function to sync: synchronize_rcu(), synchronize_sched(), ... + * + * @head can be RCU-read traversed concurrently with this function. + * + * Note that this function blocks. + * + * Important note: the caller must take whatever action is necessary to + * prevent any other updates to @head. In principle, it is possible + * to modify the list as soon as sync() begins execution. + * If this sort of thing becomes necessary, an alternative version + * based on call_rcu() could be created. But only if -really- + * needed -- there is no shortage of RCU API members. + */ +static inline void list_splice_init_rcu(struct list_head *list, + struct list_head *head, + void (*sync)(void)) +{ + struct list_head *first = list->next; + struct list_head *last = list->prev; + struct list_head *at = head->next; + + if (list_empty(head)) + return; + + /* "first" and "last" tracking list, so initialize it. */ + + INIT_LIST_HEAD(list); + + /* + * At this point, the list body still points to the source list. + * Wait for any readers to finish using the list before splicing + * the list body into the new list. Any new readers will see + * an empty list. + */ + + sync(); + + /* + * Readers are finished with the source list, so perform splice. + * The order is important if the new list is global and accessible + * to concurrent RCU readers. Note that RCU readers are not + * permitted to traverse the prev pointers without excluding + * this function. + */ + + last->next = at; + smp_wmb(); + head->next = first; + first->prev = head; + at->prev = last; +} + +/** * list_entry - get the struct for this entry * @ptr: the &struct list_head pointer. * @type: the type of the struct this is embedded in.