• Final request for feedback

    From David Newall@davidn@davidnewall.com to comp.lang.postscript on Sun Feb 20 14:11:37 2022
    From Newsgroup: comp.lang.postscript

    Hi All,

    I'm about to publish my UTF-8 code. Before I do I'm asking for feedback
    and opinions for what should be the last time.

    What's different about what I'm finally intending to publish:

    1. I'm using a dictionary for the UNICODE encoding map, instead of a
    sparse array. This isn't because it's faster (it's 3ns slower, which
    seems quite acceptable), and a dictionary is also bigger: over double
    the size for GNU's UnifontMedium. I'm doing this because it's two fewer
    files to publish: I don't need to publish sparseget, and I don't need to
    publish an AWK script to convert Fontforge .g2n files into a sparse
    array.

    2. I've replaced utf8show with utf8decode (which generates an array of
    UNICODE values) and unicodeshow.

    3. I'm not storing the map in the font, but passing it as a parameter
    to unicodeshow because I think it's simpler. Storing it in the font
    means defining a new font (definefont).
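
    A minimal sketch of points 1 and 3, not the code to be published: it
    assumes the map is a dictionary keyed by integer code point with glyph
    names as values, and that the map sits below the array of code points
    on the stack. DemoMap and demounicodeshow are hypothetical names.

    ```postscript
    %!PS
    % Hypothetical stand-in for the real map: code point -> glyph name.
    /DemoMap <<
      16#0041 /A
      16#201C /quotedblleft
      16#201D /quotedblright
    >> def

    % map codes  demounicodeshow  -
    % Shows each code point with glyphshow; unmapped codes become /.notdef.
    /demounicodeshow {
      {                                     % map code
        1 index exch                        % map map code
        2 copy known                        % map map code bool
        { get } { pop pop /.notdef } ifelse % map name
        glyphshow                           % map
      } forall
      pop
    } bind def

    /Helvetica 20 selectfont
    100 300 moveto
    DemoMap [ 16#201C 16#0041 16#201D ] demounicodeshow
    showpage
    ```

    Because the map is an ordinary operand, the same font can be used with
    different maps without any definefont call.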

    These are the alternative programs for printing a UTF-8 string.

    This is what I think I'll publish:

    %!PS
    %%IncludeResource: procset unicodeshow
    %%IncludeResource: procset utf8decode
    /Helvetica 20 selectfont
    100 300 moveto
    (Welcome to \342\200\234UTF-8\342\200\235 \342\230\272) utf8decode
    ReverseAdobeGlyphList exch unicodeshow
    showpage
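
    For readers unfamiliar with the decoding step, here is a rough sketch
    of what a procedure like utf8decode might do. It is an assumption about
    the behaviour (string in, array of integer code points out), not the
    actual procset; demoutf8decode is a hypothetical name, and the sketch
    does no validation of malformed input.

    ```postscript
    %!PS
    % string  demoutf8decode  array
    /demoutf8decode {
      4 dict begin
      /need 0 def                 % continuation bytes still expected
      /cp 0 def                   % code point being assembled
      [ exch
      {                           % called once per byte
        /b exch def
        need 0 gt {
          % continuation byte 10xxxxxx: append its low six bits
          /cp cp 6 bitshift b 16#3F and or def
          /need need 1 sub def
          need 0 eq { cp } if     % sequence complete: leave it for ]
        } {
          b 16#80 lt { b } {      % 0xxxxxxx: plain ASCII
            b 16#E0 lt { /cp b 16#1F and def /need 1 def } {    % 110xxxxx
              b 16#F0 lt { /cp b 16#0F and def /need 2 def } {  % 1110xxxx
                /cp b 16#07 and def /need 3 def                 % 11110xxx
              } ifelse
            } ifelse
          } ifelse
        } ifelse
      } forall
      ]
      end
    } bind def

    (A\342\200\234\342\230\272) demoutf8decode ==   % [65 8220 9786]
    ```

    The state lives in a small local dictionary rather than on the operand
    stack, which keeps the mark ... ] array construction simple.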

    This is what I was previously intending, using a dictionary:

    %!PS
    %%IncludeResource: procset unicodefont
    %%IncludeResource: procset unicodeshow
    %%IncludeResource: procset utf8decode
    /Helvetica findfont 20 scalefont ReverseAdobeGlyphList unicodefont
    /MyFont exch definefont setfont
    100 300 moveto
    (Welcome to \342\200\234UTF-8\342\200\235 \342\230\272) utf8decode
    unicodeshow
    showpage
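
    To make the extra definefont step concrete, this is a sketch of how a
    procedure like unicodefont could attach the map to a font. It is an
    illustration only: demounicodefont and the /UnicodeMap key are
    hypothetical, and the real procset may work differently. Here the copy
    is made before scaling.

    ```postscript
    %!PS
    % font map  demounicodefont  font'
    % Copies the font dictionary (all entries except FID) and stores the
    % map under a hypothetical /UnicodeMap key.
    /demounicodefont {
      exch                          % map font
      dup length 1 add dict begin   % map font   (copy is now currentdict)
      { 1 index /FID ne { def } { pop pop } ifelse } forall   % map
      /UnicodeMap exch def
      currentdict end               % font'
    } bind def

    /Helvetica findfont << 16#0041 /A >> demounicodefont
    /DemoFont exch definefont 20 scalefont setfont

    % A show procedure could then fetch the map from the current font:
    currentfont /UnicodeMap get 16#0041 get ==   % /A
    ```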

    There's one extra line if using a sparse array instead of a dictionary:

    %%IncludeResource: procset sparseget

    I think the first is better but am open to opposing opinions.

    Thanks,

    David
    --- Synchronet 3.21d-Linux NewsLink 1.2
  • From luser droog@luser.droog@gmail.com to comp.lang.postscript on Tue Feb 22 07:36:22 2022
    From Newsgroup: comp.lang.postscript

    On Saturday, February 19, 2022 at 9:11:52 PM UTC-6, David Newall wrote:
    > Hi All,
    >
    > I'm about to publish my UTF-8 code. Before I do I'm asking for feedback
    > and opinions for what should be the last time.

    That looks really good to me. I'm a little sad that definefont is out,
    but it really doesn't appear to offer very much. It seems like
    PostScript *almost* has the pieces available to put this together
    seamlessly. But the conversion probably can't use a filtered file
    because of the need to convert from a string to an array. And packing
    the glyph selection into a composite font would be a ton of work if
    it's even possible.
  • From Carlos@carlos@cvkm.cz to comp.lang.postscript on Sat Feb 26 01:56:15 2022
    From Newsgroup: comp.lang.postscript

    On Sun, 20 Feb 2022 14:11:37 +1100
    David Newall <davidn@davidnewall.com> wrote:
    > [...]
    > 3. I'm not storing the map in the font, but passing it as a parameter
    > to unicodeshow because I think it's simpler. Storing it in the font
    > means defining a new font (definefont).

    I think the map problem -- how to get a good map, since the Adobe Glyph
    List is insufficient -- is the key. The interface and/or implementation
    IMO is not so important (I posted an alternative implementation in
    another message, but it's still limited to the meager 4K+ glyphs in the
    Adobe list plus whatever extra /uniXXXX the font has...).
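
    As an illustration of the /uniXXXX fallback mentioned above (a sketch,
    not the implementation posted in the other message): a glyph name can
    be built from the code point and tried when the map has no entry.
    demouniname and demoglyphname are hypothetical names; the sketch
    handles only code points up to U+FFFF and assumes the current font has
    a CharStrings dictionary, as Type 1 and Type 42 fonts do.

    ```postscript
    %!PS
    % code  demouniname  name      e.g. 16#263A -> /uni263A
    /demouniname {
      (uni0000) 7 string copy      % code buf   (fresh copy of the template)
      exch 16 4 string cvrs        % buf hex    (1 to 4 upper-case hex digits)
      1 index exch                 % buf buf hex
      dup length 7 exch sub exch   % buf buf index hex  (right-align digits)
      putinterval                  % buf
      cvn
    } bind def

    % map code  demoglyphname  name
    % Uses the map first, then /uniXXXX if the font has it, else /.notdef.
    /demoglyphname {
      2 copy known { get } {
        exch pop demouniname       % name
        dup currentfont /CharStrings get exch known not
        { pop /.notdef } if
      } ifelse
    } bind def

    16#263A demouniname ==         % /uni263A
    16#0041 demouniname ==         % /uni0041
    ```

    This widens coverage only as far as the font itself carries /uniXXXX
    glyphs, so it does not remove the need for a better map.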

    C.