[illumos-Developer] Important - time sensitive: Drive failures and infinite waits

Garrett D'Amore garrett at damore.org
Fri May 27 12:14:20 PDT 2011


It does.  One type of problem is a drive that does not fail hard but manages to limp along, completing only a request or two per second.  We don't have a good defense for this at present.  (Internal retry logic in the drives makes this harder, too.)
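
To illustrate why a per-command timeout alone misses this case, here is a minimal sketch, not anything that exists in illumos.  Every name and threshold in it (CMD_TIMEOUT_S, SLOW_LATENCY_S, WINDOW, SLOW_FRACTION, LatencyWatchdog) is a hypothetical chosen for the example.  A drive completing one or two requests per second finishes each command well inside any plausible timeout, so the only signal is sustained high latency across many completions.

```python
from collections import deque

# All values below are assumptions for illustration, not driver defaults.
CMD_TIMEOUT_S = 60.0    # assumed per-command timeout
SLOW_LATENCY_S = 0.2    # assumed latency above which a completion counts as slow
WINDOW = 32             # number of recent completions to examine
SLOW_FRACTION = 0.9     # fraction of slow completions that marks a drive as limping


class LatencyWatchdog:
    """Tracks recent completion latencies for one drive (hypothetical helper)."""

    def __init__(self):
        # Only the most recent WINDOW samples are kept.
        self.samples = deque(maxlen=WINDOW)

    def record(self, latency_s):
        """Record the latency of one completed command, in seconds."""
        self.samples.append(latency_s)

    def timed_out(self, latency_s):
        """What a plain per-command timeout would detect."""
        return latency_s >= CMD_TIMEOUT_S

    def limping(self):
        """True once the window is full and nearly every completion was slow."""
        if len(self.samples) < WINDOW:
            return False
        slow = sum(1 for s in self.samples if s >= SLOW_LATENCY_S)
        return slow / WINDOW >= SLOW_FRACTION


if __name__ == "__main__":
    drive = LatencyWatchdog()
    for _ in range(WINDOW):
        drive.record(0.7)  # roughly 1.4 requests per second
    print("per-command timeout fires:", drive.timed_out(0.7))
    print("flagged as limping:", drive.limping())
```

A windowed check like this only flags the drive; what to do next (fault it, stop issuing I/O, fail over) is a separate policy question, and retries hidden inside the drive can still distort the latencies the host sees.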

  -- Garrett D'Amore

On May 27, 2011, at 4:33 PM, Gary Mills <mills at cc.umanitoba.ca> wrote:

> On Fri, May 27, 2011 at 02:04:42AM +0100, Alasdair Lumsden wrote:
>> 
>> Gordon Ross and George Wilson were kind enough to do some extensive
>> rummaging around prior to the reboot, and with some input from Eric
>> Schrock, it sounds like the issue was a phy lock due to an ASIC
>> fault in the LSI 1068 present on the cards when used with SATA
>> drives specifically.
> 
> Perhaps we need a software watchdog to protect against hardware
> failures of that sort?  Doesn't the SCSI driver time out and do a bus
> reset when the target doesn't respond?
> 
> -- 
> -Gary Mills-        -Unix Group-        -Computer and Network Services-
> 
> _______________________________________________
> Developer mailing list
> Developer at lists.illumos.org
> http://lists.illumos.org/m/listinfo/developer


