[illumos-Developer] Important - time sensitive: Drive failures and infinite waits
Garrett D'Amore
garrett at damore.org
Fri May 27 12:14:20 PDT 2011
It does. One type of problem is a drive that does not hard fail but manages to limp along, completing a request or two per second. We don't have a good defense for this at present. (Internal retry logic in the drives makes this harder, too.)
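
To make the limping-drive case concrete, here is a minimal sketch of one possible detection heuristic. It is not anything that exists in illumos today, and every name and threshold in it (SlowDriveDetector, window, slow_threshold_s, hard_timeout_s) is a hypothetical chosen for illustration. The idea is that each request finishes well inside the per-command timeout, so the timeout never fires, but the median latency over a recent window gives the drive away.

```python
from collections import deque


class SlowDriveDetector:
    """Hypothetical sketch: flag a device whose requests complete, but far too slowly.

    All names and default thresholds are illustrative assumptions, not
    values taken from any real driver.
    """

    def __init__(self, window=8, slow_threshold_s=0.5, hard_timeout_s=60.0):
        # Keep only the most recent `window` completion latencies.
        self.latencies = deque(maxlen=window)
        self.slow_threshold_s = slow_threshold_s
        self.hard_timeout_s = hard_timeout_s

    def record(self, latency_s):
        """Record the completion latency (in seconds) of one request."""
        self.latencies.append(latency_s)

    def state(self):
        """Classify the device from the recent latency window."""
        # A request that reached the hard timeout is the case an ordinary
        # per-command timeout already catches.
        if any(lat >= self.hard_timeout_s for lat in self.latencies):
            return "timed_out"
        # Do not judge the device until the window is full.
        if len(self.latencies) < self.latencies.maxlen:
            return "unknown"
        # Use the (upper) median so one slow request does not condemn a
        # healthy drive, while a sustained crawl still shows up.
        ordered = sorted(self.latencies)
        median = ordered[len(ordered) // 2]
        return "limping" if median >= self.slow_threshold_s else "healthy"


if __name__ == "__main__":
    detector = SlowDriveDetector()
    # A drive doing "a request or two per second": each request takes
    # about 0.7 s, far below the hard timeout.
    for _ in range(8):
        detector.record(0.7)
    print(detector.state())
```

The median is used rather than the mean so that a single long internal retry does not flag an otherwise healthy drive. What to do once a drive is classified as limping (fault it, deprioritize it, or just log it) is a separate policy question that this sketch does not address.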
-- Garrett D'Amore
On May 27, 2011, at 4:33 PM, Gary Mills <mills at cc.umanitoba.ca> wrote:
> On Fri, May 27, 2011 at 02:04:42AM +0100, Alasdair Lumsden wrote:
>>
>> Gordon Ross and George Wilson were kind enough to do some extensive
>> rummaging around prior to the reboot, and with some input from Eric
>> Schrock, it sounds like the issue was a phy lock due to an ASIC
>> fault in the LSI 1068 present on the cards when used with SATA
>> drives specifically.
>
> Perhaps we need a software watchdog to protect against hardware
> failures of that sort? Doesn't the SCSI driver time out and do a bus
> reset when the target doesn't respond?
>
> --
> -Gary Mills- -Unix Group- -Computer and Network Services-