[illumos-Developer] Request for Advice: Unicode/language expert opinions

Garrett D'Amore garrett at nexenta.com
Tue May 10 16:33:17 PDT 2011


I saw your note.  I'm fairly confident that Yuri's approach is actually more complete and more correct, but we can check more carefully.  (Case folding is different and may include, e.g., titlecase and mappings that are not one-to-one.)  I'm not worried about duplication, and I prefer to use the same localedef inputs rather than inventing some new mechanism just for these tables.

Yuri is also fixing other aspects of the ctype support this way.
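
To make the distinction concrete, here is a minimal sketch.  It uses Python's built-in string methods purely as an illustration; it is not the illumos ctype code, and the helper name describe() is made up for this example.  It shows one character whose case fold is not one-to-one and one titlecase character that has distinct upper and lower forms.

```python
import unicodedata

def describe(ch):
    """Return (general category, lower, upper, casefold) for one character."""
    return (unicodedata.category(ch), ch.lower(), ch.upper(), ch.casefold())

SHARP_S = "\u00df"   # LATIN SMALL LETTER SHARP S
DZ_TITLE = "\u01c5"  # LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON

# Python applies full (possibly multi-character) mappings here, which is
# more than a one-code-point toupper()/tolower() interface can express.
for ch in (SHARP_S, DZ_TITLE):
    cat, lo, up, fold = describe(ch)
    print("U+%04X %s lower=%r upper=%r fold=%r" % (ord(ch), cat, lo, up, fold))
```

The sharp s folds to two characters, and the titlecase digraph (category Lt) is neither the upper nor the lower form of its pair, which is why a folding table and a toupper/tolower table are not interchangeable.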

Gordon Ross <gordon.w.ross at gmail.com> wrote:

>On Tue, May 10, 2011 at 5:41 PM, Garrett D'Amore <garrett at nexenta.com> wrote:
>> Case folding in Unicode is a bit different... it's about creating a case-insensitive match, which is not the same as going back and forth between cases.
>>
>> I think Yuri's approach on this is sane.
>>
>
>Yes, I understand that case folding is different than identifying
>upper/lower pairs for ctype, but it's closely related for toupper
>and tolower.  The data I pointed to can be used for both.
>
>I haven't looked at Yuri's stuff yet.
>
>I'd like to focus on the requirements first, so we don't waste Yuri's time
>having him implement something that later turns into (another) discussion
>about what the correct and desired functionality should be.
>
>In this case, the functional requirement I'd like to state is that this
>implementation should map all "C" and "S" upper/lower pairs in the
>case folding table.  I believe these are the ones where toupper and
>tolower should implement reversible conversions.  (Correct?)
>
>I'm slightly concerned that by compiling this data by extracting from
>several locale-specific UTF-8 files, we might easily miss some.
>If nothing else, the case folding table might be a way to check that
>we have not missed any.  Or it could be used for implementation.
>
>Did you see my note about the u8_textprep stuff we already have?
>Are we creating undesirable duplication with that?
>
>Gordon
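
The cross-check suggested above could be sketched roughly as follows.  This is an illustration under stated assumptions, not an existing tool: the sample lines follow the CaseFolding.txt layout (code; status; mapping; # name), the function names are hypothetical, and py_tolower() is only a stand-in for whatever tolower/towlower implementation is being verified.

```python
# A few sample lines in the CaseFolding.txt layout (not the full file).
SAMPLE = """\
0041; C; 0061; # LATIN CAPITAL LETTER A
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
0130; T; 0069; # LATIN CAPITAL LETTER I WITH DOT ABOVE
03C2; C; 03C3; # GREEK SMALL LETTER FINAL SIGMA
1E9E; S; 00DF; # LATIN CAPITAL LETTER SHARP S
"""

def simple_pairs(text):
    """Yield (source, target) code points for status C and S entries."""
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comment
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        code, status, mapping = fields[0], fields[1], fields[2]
        if status in ("C", "S"):               # skip F (full) and T (Turkic)
            yield int(code, 16), int(mapping, 16)

def missing_pairs(text, tolower):
    """Return the C/S pairs that the given tolower function does not produce."""
    return [(src, dst) for src, dst in simple_pairs(text)
            if tolower(src) != dst]

def py_tolower(cp):
    # Stand-in for the implementation under test; uses Python's own tables.
    low = chr(cp).lower()
    return ord(low) if len(low) == 1 else cp

for src, dst in missing_pairs(SAMPLE, py_tolower):
    print("U+%04X -> U+%04X not produced by tolower" % (src, dst))
```

A flagged entry is a candidate for review rather than a confirmed bug.  In this sample the final sigma line is reported: it is a status C fold between two lowercase letters, so not every "C" or "S" entry is an upper/lower pair.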

