[illumos-Developer] webrev - fix for 169 Lack of "rune" wctype breaks tr -C

Garrett D'Amore garrett at nexenta.com
Fri Sep 10 16:37:43 PDT 2010


Ah, yeah, I see what you're saying now.

I was confusing the encoded digit with the width.

If you look here:

usr/src/lib/libc/port/locale/_ctype.h

You'll see that the upper nybble is indeed used for character encoding.
This is used only for EUC locales (puke!).  

So perhaps instead of -1, it would have been more correct to encode
0xffffff, and I can fix that if it's a big deal.

It's not a *runtime* problem though, because the only encodings that
would have non-zero values there would also be valid runes.

	- Garrett


On Sat, 2010-09-11 at 01:16 +0200, Joerg Schilling wrote:
> "Garrett D'Amore" <garrett at damore.org> wrote:
> 
> > They are not used anywhere in Solaris.
> >
> > We don't use BSD character width info in the same way, because we had
> > requirements to retain compatibility with Solaris' encoding of the ctype
> > data.  We're good here, but thanks for your concern.
> 
> Sure?
> 
> int
> wcwidth(wchar_t wc)
> {
>         unsigned int x;
> 
>         if (wc == 0)
>                 return (0);
> 
>         x = ((wc < 0 || wc >= _CACHED_RUNES) ? ___runetype(wc) :
>             _CurrentRuneLocale->__runetype[wc]) & (_CTYPE_SWM|_CTYPE_R);
> 
>         if ((x & _CTYPE_SWM) != 0)
>                 return ((x & _CTYPE_SWM) >> _CTYPE_SWS);
>         return ((x & _CTYPE_R) != 0 ? 1 : -1);
> }
> 
> looks as if it still uses the two top bits to code the character width.
> 
> Jörg
> 
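
For readers following the bit arithmetic in the quoted wcwidth(), here is a
minimal standalone sketch of the same mask-and-shift logic. The EX_* constants
are hypothetical values modeled on the BSD-style layout and are not copied from
the illumos headers; whether the illumos ctype data actually uses these bits
for width or for EUC encoding information is the very point under discussion
in this thread.

```c
#include <stdint.h>

/*
 * Hypothetical constants for illustration only (assumed, BSD-style layout).
 * They are NOT taken from usr/src/lib/libc/port/locale/_ctype.h.
 */
#define	EX_CTYPE_R	0x00040000U	/* printable ("rune is valid") bit */
#define	EX_CTYPE_SWM	0xe0000000U	/* mask for the screen-width field */
#define	EX_CTYPE_SWS	30		/* shift used to extract the width */
#define	EX_CTYPE_SW0	0x20000000U	/* explicit width 0 */
#define	EX_CTYPE_SW1	0x40000000U	/* explicit width 1 */
#define	EX_CTYPE_SW2	0x80000000U	/* explicit width 2 */

/*
 * Mirror of the decision logic in the quoted wcwidth(), operating on an
 * already looked-up ctype word instead of a wchar_t.
 */
static int
ex_width_from_bits(uint32_t bits)
{
	uint32_t x = bits & (EX_CTYPE_SWM | EX_CTYPE_R);

	/* Any bit set in the width field: the width is the top two bits. */
	if ((x & EX_CTYPE_SWM) != 0)
		return ((int)((x & EX_CTYPE_SWM) >> EX_CTYPE_SWS));

	/* No width information: printable means 1 column, otherwise -1. */
	return ((x & EX_CTYPE_R) != 0 ? 1 : -1);
}
```

Under these assumed values, a word carrying EX_CTYPE_SW2 yields 2, a word with
only EX_CTYPE_R yields 1, and a word with neither field set yields -1. Note
that EX_CTYPE_SW0 sets bit 29, which the shift by 30 discards, so it decodes
to width 0 while still making the mask test non-zero.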



