[illumos-Developer] illumos bug 453

Deano deano at rattie.demon.co.uk
Sun Jan 9 09:18:39 PST 2011


Hi George,
I've looked into the AHCI driver code (which was my first thought as well)
and AFAICT AHCI driver correctly reads the data from the drive as specified
by ATAPI-7 standard, on the drives I have to test (all SAMSUNG F4 2TB) it
doesn't look like it implements the ATAPI-7 API at least on the firmware I
have. The correct behavior if ATAPI-7 isn't exposed it to assume the default
of 512B that AHCI driver does, so I think its valid code (however without
the ability to test, hard to know for sure).

Internet reports indicate similar issues on other OSs on similar 4K drives
(but report 512B) so I'm inclined to believe the current driver code is
correct and it's a drive problem.

I'm happy to delve in further into the driver issues, once I have more types
of drives to test on, but for now...

I think the "correct" solution if I were coding at an OS company would be to
keep a table of drives that are lying or omitting the info in the AHCI
driver and return the correct size, so it is contained at the lowest levels.
But that's hard to do as an open source project, as I can't just order all
the 2 or 3TB drives out there to build the workaround table. This way users
can at least correct the issues as required.

Bye,
Deano

-----Original Message-----
From: George Wilson [mailto:gwilson at zfsmail.com] 
Sent: 09 January 2011 17:00
To: developer at lists.illumos.org
Subject: Re: [illumos-Developer] illumos bug 453

Deano,

First of all, it's great that someone is looking into this but I was 
wondering if anybody has looked at why the ioctl() below is not giving 
us the correct information?

         /*
          * Determine the device's minimum transfer size.
          * If the ioctl isn't supported, assume DEV_BSIZE.
          */
         if (ldi_ioctl(dvd->vd_lh, DKIOCGMEDIAINFOEXT, (intptr_t)&dkmext,
             FKIOCTL, kcred, NULL) != 0)
                 dkmext.dki_pbsize = DEV_BSIZE;

         *ashift = highbit(MAX(dkmext.dki_pbsize, SPA_MINBLOCKSIZE)) - 1;

I'm not a big fan of exposing an implementation detail, like 
vdev_ashift, to users as it makes it more difficult to remove later. I'm 
happy to review your code but would first like to make sure we're making 
the right design decision.

If we're after a quick and dirty way to allow users to set their ashift 
because they have some funky drives then we could also provide an 
override via a kernel tunable. This buy us some time and allow us to 
solve the underlying problem with the code above.

Thanks,
George




On 1/9/11 8:24 AM, Deano wrote:
> Hi,
>
> I have a fix for bug 453 ready to be checked in.
>
> Its adds an optional parameter to zpool, such that you can specify the
drive
> block size, normally this isn't needed as ZFS reads it from the drive
itself
> but recently a number of SATA 4KiB blocks drives have appeared (for>= 2TB
> disks), that either lie or don't bother to export the correct size, which
> can cause performance issues as they are classed as 512B drives by
default.
> It is purely a tool level fix, as ASHIFT setting (which it ultimately is)
is
> already supported fully by ZFS properties just not exposed to the zpool
> tool.
>
> zpool create block-size 4096 dsk1 dsk2
>
> Which will set the block-size for *all* following drives on the command
> line. block-size 0 will reset to hw detect, and setting block-size again
is
> allowed per pool, so that different pools can have different block sizes.
>
> Been reviewed on IRC the other night (I forgot to get the emails of the
> reviewers so just have the IRC names (Triskelios, gisburn, jbk), if I can
> get the emails I can give proper credit).
>
> Passes hg nit except for copyright year warning, left that because not
sure
> how illumos as a whole wants to handle copyright update notices.
Personally
> I don't mind doing it in my name or assigning to illumos or ??
>
> Webrev with results of code reviews and hg nit fixes can be found at
> http://cloudpixies.com/webrev/bug453
>
> Any problems, just say and I'll get them fixed or if okay would be good to
> get it into the code base :)
>
> Thanks,
> Deano
>
> deano at cloudpixies.com
>
>
>
>
> _______________________________________________
> Developer mailing list
> Developer at lists.illumos.org
> http://lists.illumos.org/m/listinfo/developer


_______________________________________________
Developer mailing list
Developer at lists.illumos.org
http://lists.illumos.org/m/listinfo/developer




More information about the Developer mailing list