[illumos-Developer] Ugh... xdr_{float,double} problems

Sun Jan 30 12:27:31 PST 2011

Yeah, basically I'm a dumbass and shouldn't try to bugfix during weeks
where I'm sleep deprived (my apologies to everyone for that).  Dr.
Grabowski was right

I see the problem, and was able to get access to a S10 (sparc & x86)
box to obtain a known good XDR bytestream (of floats and doubles) to
test against so the test can be more definitive.

I have a webrev of the fix here: http://cr.illumos.org/view/kc6gu36g/
The #pragma redefine_extname bit was a suggestion from Gordon (just to
help people porting to other platforms out).

I really need to run errands now but will be back later to post the
test code (short version, I take the known good stream from s10
x86/sparc, decode it, compare values, then encode the values to
another stream and compare vs. trying to 'guess' the correct values in
the previous test), and take care of the rest.

On Sat, Jan 29, 2011 at 10:05 PM, Garrett D'Amore <garrett at nexenta.com> wrote:
> Any update on this?  I've not had a chance yet to backout your change..
> I'm not sure what the right thing to do here is... I'd rather have a
> "real" fix.
>
>        - Garrett
>
> On Fri, 2011-01-28 at 12:10 -0600, Jason King wrote:
>> Ok, however things will still be broken, my fix just breaks (or more
>> precisely, breaks everyone's workarounds) faster.  I can probably have
>> a true fix later tonight as well.
>>
>> On Fri, Jan 28, 2011 at 10:58 AM, Garrett D'Amore <garrett at nexenta.com> wrote:
>> > Ok, I'll backout the changes.  Can you please file a bug?  I probably won't
>> > get around to backing this change out until tonight as I'm traveling and on
>> > limited battery power.
>> >
>> >    - Garrett
>> >
>> > On 01/28/11 08:48 AM, Jason King wrote:
>> >>
>> >> After lengthy discussions, and even more digging... it looks like
>> >> there's a bit of a very-long standing mess with the C xdr API.
>> >>
>> >> The problem is, the OTW XDR format for 32&  64 bit doubles is
>> >> identical on x86, amd64, sparc, and sparcv9.  In theory, the code
>> >> should then just merely copy the values out unchanged.
>> >> However, that really happened was on little-endian platforms, the byte
>> >> values for fp values were being (incorrectly) byteswapped by the
>> >> library.  Instead of fixing the library, it appears the existing
>> >> software that actually uses fp with xdr workaround the bug on those
>> >> platforms.  Of course the recent putback broke the workaround.
>> >>
>> >> Essentially:
>> >>
>> >> 1. code is released with broken xdr_{float,double}
>> >> 2. code is written w/ workarounds
>> >> 3. code is updated, and __amd64 define is (incorrectly) changed to
>> >> amd64.  This causes xdr_{float,double} to revert to a slower, platform
>> >> independent version that correctly (minus rouding errors) writes the
>> >> value into the stream (without swapping).
>> >> 4. My fix correctly writes fp values unchanged to XDR stream, but also
>> >> breaks any software with workarounds.  The test cases were written to
>> >> the RFC&  man pages (since I couldn't really find anything out there
>> >> that uses
>> >>
>> >> Given the pervasiveness of this, we probably need to do something similar
>> >> to:
>> >>
>> >> 1. Revert the behavior to the traditional broken behavior
>> >> 2. Update the man pages to document this behavior
>> >> 3. Add a #define to the rpc/xdr.h header file (XDR_FLOAT_NO_SWAP ?) to
>> >> allow applications to select the correct behavior when defined (and
>> >> document this in the manpage).
>> >>
>> >> _______________________________________________
>> >> Developer mailing list
>> >> Developer at lists.illumos.org
>> >> http://lists.illumos.org/m/listinfo/developer
>> >>
>> >
>> >
>> > _______________________________________________
>> > Developer mailing list
>> > Developer at lists.illumos.org
>> > http://lists.illumos.org/m/listinfo/developer
>> >
>
>
>