[illumos-Developer] Ugh... xdr_{float,double} problems
Jason King
jason.brian.king at gmail.com
Sun Jan 30 15:34:02 PST 2011
Here's the outgoing & pbchk (after reci). New webrev (content is the
same minus commit comments): http://cr.illumos.org/view/o3kn9bdi/
% hg outgoing -v ssh://anonhg@illumos.org/illumos-gate
running ssh anonhg at illumos.org "hg -R illumos-gate serve --stdio"
comparing with ssh://anonhg@illumos.org/illumos-gate
searching for changes
changeset: 13277:e850490ea829
tag: tip
user: Jason King <jason.brian.king+illumoshg at gmail.com>
date: Sun Jan 30 17:20:34 2011 -0600
description:
166 CR6901979 error in xdr_float.c not fixed
modified:
usr/src/lib/libbc/inc/include/rpc/xdr.h
usr/src/lib/libnsl/rpc/xdr_float.c
% hg pbchk
Copyright check:
C style check:
Header format check:
Java style check:
Mapfile comment check:
File permission check:
Keywords check:
Comments check:
Checking for new tags:
Checking for multiple heads (or branches):
Checking for branch changes:
Checking for uncommitted changes:
Checking for merges:
On Sun, Jan 30, 2011 at 5:15 PM, Garrett D'Amore <garrett at nexenta.com> wrote:
> Looks reasonable. Could you hg reci against what is currently in the
> tree, please?
>
> - Garrett
>
> On Sun, 2011-01-30 at 16:38 -0600, Jason King wrote:
>> On Sun, Jan 30, 2011 at 4:16 PM, Garrett D'Amore <garrett at nexenta.com> wrote:
>> > On Sun, 2011-01-30 at 16:07 -0600, Jason King wrote:
>> >> One thing I did notice is that they are typically surrounded by
>> >> #ifdef __PRAGMA_REDEFINE_EXTNAME
>> >> #endif
>> >> blocks
>> >>
>> >> I will update the header file to match the existing convention.
>> >>
>> >> One other thing to note -- it appears the existing xdr.h doesn't pass
>> >> the cstyle checks (independent of my changes).
>> >>
>> >> Some of them I can clean up as it's just some spacing issues, but it
>> >> complains a lot about using 'u_long' vs. 'ulong_t' and similar such
>> >> differences with a few other integral types. I'm more hesitant to
>> >> change those without some additional guidance. I don't think they
>> >> would break anything by changing them. ISTR that C isn't _that_
>> >> strict about typing (so using ulong_t vs u_long shouldn't make a
>> >> difference), but it's one of those hairy areas that I'm not sure
>> >> about.
>> >
>> > You are not responsible for fixing pre-existing issues. Your new code
>> > should be clean, and if its easy to clean stuff, then you should. But
>> > not if its going to be really hairy and risky.
>> >
>> > That said, ulong == u_long == ulong_t where C is concerned. These are
>> > all typedef'd to "unsigned long", and you can change them as you need to
>> > to pass C style without any risk.
>>
>> It's small enough in number that I threw it in.
>>
>> http://cr.illumos.org/view/ka87apne/ has all the updates (sans hg recommit).
>>
>> I've just been doing a dmake install/dmake lint with the current fixes
>> (which all come back ok), so now with all the issues fixed, I've
>> kicked off a full nightly on x86 & sparc to be 100% sure everything's
>> good.
>>
>> >
>> >>
>> >> If not, I could add the /* {BEGIN,END} CSTYLED */ tags as an
>> >> alternative (or just ignore it altogether if everyone's happy with
>> >> that as well).
>> >
>> > No, we'll just waive the cstyle output in this case. Someday someone
>> > *should* go back and fix that.
>> >
>> > - Garrett
>> >
>> >>
>> >> On Sun, Jan 30, 2011 at 3:38 PM, Garrett D'Amore <garrett at nexenta.com> wrote:
>> >> > If that particular pragma is so widely used, then lets just go ahead and
>> >> > keep it for now.
>> >> >
>> >> > - Garrett
>> >> >
>> >> > On Sun, 2011-01-30 at 15:33 -0600, Jason King wrote:
>> >> >> On Sun, Jan 30, 2011 at 2:52 PM, Garrett D'Amore <garrett at nexenta.com> wrote:
>> >> >> > General review feedback:
>> >> >> >
>> >> >> > No spacing between # and CPP directives, please.
>> >> >>
>> >> >> Fixed.
>> >> >>
>> >> >> >
>> >> >> > I'm a bit worried about pragma redefine_extname -- what happens to that
>> >> >> > on gcc or Studio? Do they both support it?
>> >> >>
>> >> >> They should -- it's used extensively throughout the code already
>> >> >> (stdio, sockets, etc. about 150 times), studio definitely does, if gcc
>> >> >> doesn't there would be much bigger problems :)
>> >> >>
>> >> >> >
>> >> >> > I'm thinking #error might be better here.... it would force people
>> >> >> > working on alternate processor ports to adapt the code...
>> >> >>
>> >> >> In that case, should we just ditch the portable implementation altogether?
>> >> >>
>> >> >> >
>> >> >> > - Garrett
>> >> >> >
>> >> >> > On Sun, 2011-01-30 at 14:27 -0600, Jason King wrote:
>> >> >> >> Yeah, basically I'm a dumbass and shouldn't try to bugfix during weeks
>> >> >> >> where I'm sleep deprived (my apologies to everyone for that). Dr.
>> >> >> >> Grabowski was right
>> >> >> >>
>> >> >> >> I see the problem, and was able to get access to a S10 (sparc & x86)
>> >> >> >> box to obtain a known good XDR bytestream (of floats and doubles) to
>> >> >> >> test against so the test can be more definitive.
>> >> >> >>
>> >> >> >> I have a webrev of the fix here: http://cr.illumos.org/view/kc6gu36g/
>> >> >> >> The #pragma redefine_extname bit was a suggestion from Gordon (just to
>> >> >> >> help people porting to other platforms out).
>> >> >> >>
>> >> >> >> I really need to run errands now but will be back later to post the
>> >> >> >> test code (short version, I take the known good stream from s10
>> >> >> >> x86/sparc, decode it, compare values, then encode the values to
>> >> >> >> another stream and compare vs. trying to 'guess' the correct values in
>> >> >> >> the previous test), and take care of the rest.
>> >> >> >>
>> >> >> >> On Sat, Jan 29, 2011 at 10:05 PM, Garrett D'Amore <garrett at nexenta.com> wrote:
>> >> >> >> > Any update on this? I've not had a chance yet to backout your change..
>> >> >> >> > I'm not sure what the right thing to do here is... I'd rather have a
>> >> >> >> > "real" fix.
>> >> >> >> >
>> >> >> >> > - Garrett
>> >> >> >> >
>> >> >> >> > On Fri, 2011-01-28 at 12:10 -0600, Jason King wrote:
>> >> >> >> >> Ok, however things will still be broken, my fix just breaks (or more
>> >> >> >> >> precisely, breaks everyone's workarounds) faster. I can probably have
>> >> >> >> >> a true fix later tonight as well.
>> >> >> >> >>
>> >> >> >> >> On Fri, Jan 28, 2011 at 10:58 AM, Garrett D'Amore <garrett at nexenta.com> wrote:
>> >> >> >> >> > Ok, I'll backout the changes. Can you please file a bug? I probably won't
>> >> >> >> >> > get around to backing this change out until tonight as I'm traveling and on
>> >> >> >> >> > limited battery power.
>> >> >> >> >> >
>> >> >> >> >> > - Garrett
>> >> >> >> >> >
>> >> >> >> >> > On 01/28/11 08:48 AM, Jason King wrote:
>> >> >> >> >> >>
>> >> >> >> >> >> After lengthy discussions, and even more digging... it looks like
>> >> >> >> >> >> there's a bit of a very-long standing mess with the C xdr API.
>> >> >> >> >> >>
>> >> >> >> >> >> The problem is, the OTW XDR format for 32& 64 bit doubles is
>> >> >> >> >> >> identical on x86, amd64, sparc, and sparcv9. In theory, the code
>> >> >> >> >> >> should then just merely copy the values out unchanged.
>> >> >> >> >> >> However, that really happened was on little-endian platforms, the byte
>> >> >> >> >> >> values for fp values were being (incorrectly) byteswapped by the
>> >> >> >> >> >> library. Instead of fixing the library, it appears the existing
>> >> >> >> >> >> software that actually uses fp with xdr workaround the bug on those
>> >> >> >> >> >> platforms. Of course the recent putback broke the workaround.
>> >> >> >> >> >>
>> >> >> >> >> >> Essentially:
>> >> >> >> >> >>
>> >> >> >> >> >> 1. code is released with broken xdr_{float,double}
>> >> >> >> >> >> 2. code is written w/ workarounds
>> >> >> >> >> >> 3. code is updated, and __amd64 define is (incorrectly) changed to
>> >> >> >> >> >> amd64. This causes xdr_{float,double} to revert to a slower, platform
>> >> >> >> >> >> independent version that correctly (minus rouding errors) writes the
>> >> >> >> >> >> value into the stream (without swapping).
>> >> >> >> >> >> 4. My fix correctly writes fp values unchanged to XDR stream, but also
>> >> >> >> >> >> breaks any software with workarounds. The test cases were written to
>> >> >> >> >> >> the RFC& man pages (since I couldn't really find anything out there
>> >> >> >> >> >> that uses
>> >> >> >> >> >>
>> >> >> >> >> >> Given the pervasiveness of this, we probably need to do something similar
>> >> >> >> >> >> to:
>> >> >> >> >> >>
>> >> >> >> >> >> 1. Revert the behavior to the traditional broken behavior
>> >> >> >> >> >> 2. Update the man pages to document this behavior
>> >> >> >> >> >> 3. Add a #define to the rpc/xdr.h header file (XDR_FLOAT_NO_SWAP ?) to
>> >> >> >> >> >> allow applications to select the correct behavior when defined (and
>> >> >> >> >> >> document this in the manpage).
>> >> >> >> >> >>
>> >> >> >> >> >> _______________________________________________
>> >> >> >> >> >> Developer mailing list
>> >> >> >> >> >> Developer at lists.illumos.org
>> >> >> >> >> >> http://lists.illumos.org/m/listinfo/developer
>> >> >> >> >> >>
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > _______________________________________________
>> >> >> >> >> > Developer mailing list
>> >> >> >> >> > Developer at lists.illumos.org
>> >> >> >> >> > http://lists.illumos.org/m/listinfo/developer
>> >> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >
>> >> >
>> >> >
>> >
>> >
>> >
>
>
>
More information about the Developer
mailing list