• PSA: HTML fragment mode interaction between Chromium, Clipboard & Notepad++

    From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Tue Feb 10 07:17:37 2026
    From Newsgroup: alt.comp.os.windows-10

    PSA: HTML fragment mode interaction between Chromium, Clipboard & Notepad++

    There is a confusing interaction between Chromium-based applications when copying to the Windows clipboard and then pasting into Notepad++ that can
    break basic editing functions like Select All (Ctrl+A).

    What happens is the following sequence:
    a. You copy Chromium-based text (Ctrl+C) to the Windows clipboard
    b. You paste that into Notepad++
    c. You try to select all (Ctrl+A)
    What happens is the selection flashes, but nothing is selected
    If you add a blank line at the top, then Ctrl+A works as expected.

    It took me a long time to track this very repeatable artifact down, so I am sharing the explanation here for anyone else who runs into it.

    When you copy text from any Chromium-based application, apparently Windows
    does not just copy plain text. Windows actually puts two versions of the
    text on the Windows clipboard:
    1. a normal plain-text version
    2. a hidden HTML-formatted version

    This apparently happens with Chrome, Edge, Brave, Vivaldi, and many
    Electron apps such as Slack, Discord, and VS Code.

    When we paste that clipboard data into Notepad++, Notepad++ sees the hidden HTML version and Notepad++ assumes the paste is part of a larger HTML
    fragment. That puts Notepad++ into a strange internal state, which is apparently sometimes called "HTML fragment mode".

    In this mode:
    a. Ctrl+A will not select the whole document
    b. Therefore, Ctrl+X will not cut anything, because there is no selection to cut

    What took me a while to figure out was that this artifact has nothing
    to do with the text itself, so a hex editor does not show the artifact.

    It is simply how Notepad++ reacts to the Windows clipboard format used by Chromium-based applications.

    The workaround I've come up with is surprisingly simple:
    a. Insert an extraneous leading blank line at the top of the document
    b. Now when you press Ctrl+A, the selection works as expected
    c. Copy (Ctrl+C) or Cut (Ctrl+V) the selected text, as desired
    d. Then delete the extraneous leading blank line

    That tiny change of the extraneous blank line apparently forces Notepad++
    to abandon HTML fragment mode and treat the text as normal again.

    Q: Why does deleting the blank line work around this issue successfully?
    A: Because the hidden HTML fragment boundary is treated as if it were
    the first line of the document. Deleting the first line removes that
    invisible boundary and resets the buffer.

    Summary:
    A. Chromium apps (apparently) copy hidden HTML to the clipboard.
    B. Notepad++ sees that artifact and enters HTML fragment mode.
    C. There is no indication whatsoever you're in HTML fragment mode!
    D. But in that mode, Ctrl+A stops working correctly.
    E. Yet, inserting a blank line resets Notepad++ back to normal.

    If you have ever pasted something into Notepad++ and wondered why you
    suddenly cannot select or cut the text, this artifact might just be why.
    --
    Every Usenet post should strive to add palpable additional value
    so that we can all delight in dissemination of useful knowledge.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Tue Feb 10 07:32:36 2026
    From Newsgroup: alt.comp.os.windows-10

    Maria Sophia wrote:
    The workaround I've come up with is surprisingly simple:
    a. Insert an extraneous leading blank line at the top of the document
    b. Now when you press Ctrl+A, the selection works as expected
    c. Copy (Ctrl+C) or Cut (Ctrl+V) the selected text, as desired
    d. Then delete the extraneous leading blank line

    Ooops...
    I had meant "Ctrl+X" above in item (c).
    Working as a team, can you let me know if you can reproduce this easily?

    1. Open Microsoft Edge (or any Chromium-based web browser).
    2. Go to any web page, for example:
    Win+R > msedge https://en.wikipedia.org/wiki/Quotation_mark
    3. Select any text on the page (e.g., Ctrl+A)
    4. Press Ctrl+C to copy whatever text you had selected
    5. Open Notepad++
    6. Press Ctrl+V to paste the text
    7. Now press Ctrl+A
    Huh? You will see the selection flash, but nothing is selected.

    This is the "HTML fragment mode" issue described above.

    Apparently, this artifact happens because Chromium puts hidden HTML on the clipboard, and Notepad++ reacts to that by entering a special mode where
    Ctrl+A and Ctrl+X do not work.

    Please try it because maybe it's just me, but I've been battling this
    artifact for a long time because it's completely hidden in a hex editor.

    One false trail that sent me off the scent was that I had originally
    suspected this problem was related to the BOM indicator in the
    bottom-right corner of Notepad++. It is not. The BOM only tells us what encoding Notepad++ will use when saving the file. It does not affect how
    pasted text behaves.

    The HTML fragment mode problem happens even when:
    a. the file is brand new
    b. the file has never been saved
    c. the encoding is "ANSI" or "UTF-8" with no BOM
    d. there is no BOM in the file at all

    This artifact is caused entirely by the clipboard data format that Chromium applications place on the Windows clipboard. Notepad++ sees the hidden HTML version and enters HTML fragment mode. The BOM is not involved in any way.

    I've always said I never quite understood all this character set stuff...
    --
    The nice thing about Usenet is you get good ideas from everyone.
  • From Carlos E. R.@robin_listas@es.invalid to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Tue Feb 10 13:33:44 2026
    From Newsgroup: alt.comp.os.windows-10

    On 2026-02-10 13:17, Maria Sophia wrote:

    ...

    When you copy text from any Chromium-based application, apparently Windows does not just copy plain text. Windows actually puts two versions of the
    text on the Windows clipboard:
    1. a normal plain-text version
    2. a hidden HTML-formatted version

    This is a very old Windows feature. It is up to the receiving
    application to choose what version to paste.

    With a clipboard manager perhaps you can choose the version yourself.
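As a rough illustration of "the receiving application chooses", here is a minimal Python sketch. It imitates, in plain Python so it runs anywhere, the documented return values of the Win32 GetPriorityClipboardFormat call. The function name pick_format and the ID 49427 are hypothetical stand-ins, and nothing here claims to be how Notepad++ actually decides.

```python
# Standard clipboard format IDs from the Win32 headers.
CF_TEXT = 1
CF_UNICODETEXT = 13


def pick_format(available, priority):
    """Return the first format in `priority` that is in `available`.

    Follows the documented GetPriorityClipboardFormat results: 0 when the
    clipboard is empty, -1 when it holds data in none of the listed formats.
    """
    if not available:
        return 0
    for fmt in priority:
        if fmt in available:
            return fmt
    return -1


# Hypothetical ID standing in for a registered format such as "HTML Format".
# Real registered IDs are assigned at run time and are not fixed numbers.
HYPOTHETICAL_HTML_ID = 49427
on_clipboard = [HYPOTHETICAL_HTML_ID, CF_UNICODETEXT, CF_TEXT]

# A plain-text editor would list only text formats in its priority order.
plain_editor = pick_format(on_clipboard, [CF_UNICODETEXT, CF_TEXT])
# An HTML-aware application might ask for the HTML flavor first.
rich_editor = pick_format(on_clipboard, [HYPOTHETICAL_HTML_ID, CF_UNICODETEXT])
print(plain_editor, rich_editor)
```

The point of the sketch is only that the same clipboard contents can yield different pastes depending on the priority list the receiver supplies.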
    --
    Cheers,
    Carlos E.R.
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Tue Feb 10 07:44:52 2026
    From Newsgroup: alt.comp.os.windows-10

    Carlos E. R. wrote:
    When you copy text from any Chromium-based application, apparently Windows does not just copy plain text. Windows actually puts two versions of the
    text on the Windows clipboard:
    1. a normal plain-text version
    2. a hidden HTML-formatted version

    This is a very old Windows feature. It is up to the receiving
    application to choose what version to paste.

    With a clipboard manager perhaps you can choose the version yourself.

    Hi Carlos,

    Thanks for pointing that out as this artifact has been biting me because I really never understood how the characters are stored in the clipboard.

    I believe that you are absolutely right that Windows has supported multiple clipboard formats for a long time and that applications are free to
    offer several representations of the same data. That part is completely
    normal, and your reminder helped me double-check that angle while I was debugging this.

    The part that took me a while to uncover is what Chromium-based
    applications actually place on the clipboard. They do not just provide
    plain text. They also provide a CF_HTML fragment with StartFragment and EndFragment markers. That HTML fragment is what Notepad++ ends up choosing
    when pasting.
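For anyone who wants to look at those markers directly, here is a minimal Python sketch of reading a CF_HTML payload. It relies only on Microsoft's published layout for the format: a plain-text header of byte offsets followed by the HTML. The function name parse_cf_html and the SAMPLE payload are made up for illustration; a real payload from Chromium will carry more HTML and possibly extra header lines.

```python
import re

OFFSET_KEYS = ("StartHTML", "EndHTML", "StartFragment", "EndFragment")

# Hand-built sample; the offsets are byte positions from the start of the payload.
SAMPLE = (
    b"Version:0.9\r\n"
    b"StartHTML:0000000105\r\n"
    b"EndHTML:0000000174\r\n"
    b"StartFragment:0000000137\r\n"
    b"EndFragment:0000000142\r\n"
    b"<html><body><!--StartFragment-->Hello<!--EndFragment--></body></html>"
)


def parse_cf_html(payload: bytes) -> dict:
    """Read the offset fields of a CF_HTML payload and slice out the fragment."""
    offsets = {}
    for key in OFFSET_KEYS:
        pattern = rb"^" + key.encode("ascii") + rb":(-?\d+)"
        match = re.search(pattern, payload, re.MULTILINE)
        if match is None:
            raise ValueError(f"missing {key} in CF_HTML header")
        offsets[key] = int(match.group(1))
    # The fragment is whatever lies between the two fragment offsets.
    fragment = payload[offsets["StartFragment"]:offsets["EndFragment"]]
    return {"offsets": offsets, "fragment": fragment.decode("utf-8")}


result = parse_cf_html(SAMPLE)
print(result["offsets"], repr(result["fragment"]))
```

Note that the markers and offsets live entirely inside the clipboard payload, which is consistent with them never appearing in the pasted text.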

    Once Notepad++ sees that HTML fragment, it switches into what apparently
    some people call "HTML fragment mode". In that mode, the editor treats the first line of the buffer as a fragment boundary. That boundary is invisible
    in the editor, but it affects selection logic.

    As a result:
    a. Ctrl+A flashes but does not select the entire document
    b. Hence, Ctrl+X does nothing because selections behave incorrectly

    This happens even in a brand-new, unsaved buffer with no BOM and no
    encoding issues. Surprisingly, the Notepad++ hex editor will not show
    anything unusual because the artifact is not in the pasted text itself.

    The artifact is actually in the clipboard metadata that Chromium adds.

    The workaround (inserting and then deleting a blank line at the top) forces Notepad++ to abandon HTML fragment mode and treat the buffer as plain text again. After that, Ctrl+A and Ctrl+X work normally. Ask me how I know this.

    So yes, Windows is behaving as designed by offering multiple clipboard
    formats and your point about applications choosing which format to paste is correct. The unexpected part for me is that Notepad++ selects the HTML
    flavor from Chromium and then Notepad++ enters a special mode that breaks normal selection behavior.

    Your comment helped me frame the explanation more clearly, so thanks for
    that useful added value, which is why we're all here on Usenet. To learn.
    --
    Every Usenet post should strive to add palpable additional value
    so that we can all delight in dissemination of useful knowledge.
  • From Andy Burns@usenet@andyburns.uk to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Tue Feb 10 12:46:46 2026
    From Newsgroup: alt.comp.os.windows-10

    Maria Sophia wrote:

    When you copy text from any Chromium-based application, apparently Windows does not just copy plain text. Windows actually puts two versions of the
    text on the Windows clipboard:
    1. a normal plain-text version
    2. a hidden HTML-formatted version

    Not uncommon for many apps to make multiple formats available on the clipboard.

    When we paste that clipboard data into Notepad++, Notepad++ sees the hidden HTML version and Notepad++ assumes the paste is part of a larger HTML fragment. That puts Notepad++ into a strange internal state, which is apparently sometimes called "HTML fragment mode".

    I think Notepad++ has a "paste special" command; does that let you pick
    the plain text version?

  • From Paul@nospam@needed.invalid to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Tue Feb 10 07:55:44 2026
    From Newsgroup: alt.comp.os.windows-10

    On Tue, 2/10/2026 7:17 AM, Maria Sophia wrote:
    PSA: HTML fragment mode interaction between Chromium, Clipboard & Notepad++

    There is a confusing interaction between Chromium-based applications when copying to the Windows clipboard and then pasting into Notepad++ that can break basic editing functions like Select All (Ctrl+A).

    What happens is the following sequence:
    a. You copy Chromium-based text (Ctrl+C) to the Windows clipboard
    b. You paste that into Notepad++
    c. You try to select all (Ctrl+A)
    What happens is the selection flashes, but nothing is selected
    If you add a blank line at the top, then Ctrl+A works as expected.

    It took me a long time to track this very repeatable artifact down, so I am sharing the explanation here for anyone else who runs into it.

    When you copy text from any Chromium-based application, apparently Windows does not just copy plain text. Windows actually puts two versions of the
    text on the Windows clipboard:
    1. a normal plain-text version
    2. a hidden HTML-formatted version

    This apparently happens with Chrome, Edge, Brave, Vivaldi, and many
    Electron apps such as Slack, Discord, and VS Code.

    When we paste that clipboard data into Notepad++, Notepad++ sees the hidden HTML version and Notepad++ assumes the paste is part of a larger HTML fragment. That puts Notepad++ into a strange internal state, which is apparently sometimes called "HTML fragment mode".

    In this mode:
    a. Ctrl+A will not select the whole document
    b. Therefore, Ctrl+X will not cut anything
    c. Because the Ctrl+A selections are behaving incorrectly

    What took me a while to figure this out was that this artifact has nothing
    to do with the text itself so a hex editor does not show the artifact.

    It is simply how Notepad++ reacts to the Windows clipboard format used by Chromium-based applications.

    The workaround I've come up with is surprisingly simple:
    a. Insert an extraneous leading blank line at the top of the document
    b. Now when you press Ctrl+A, the selection works as expected
    c. Copy (Ctrl+C) or Cut (Ctrl+V) the selected text, as desired
    d. Then delete the extraneous leading blank line

    That tiny change of the extraneous blank line apparently forces Notepad++
    to abandon HTML fragment mode and treat the text as normal again.

    Q: Why does deleting the blank line work around this issue successfully?
    A: Because the hidden HTML fragment boundary is treated as if it were
    the first line of the document. Deleting the first line removes that
    invisible boundary and resets the buffer.

    Summary:
    A. Chromium apps (apparently) copy hidden HTML to the clipboard.
    B. Notepad++ sees that artifact and enters HTML fragment mode.
    C. There is no indication whatsoever you're in HTML fragment mode!
    D. But in that mode, Ctrl+A stops working correctly.
    E. Yet, inserting a blank line resets Notepad++ back to normal.

    If you have ever pasted something into Notepad++ and wondered why you suddenly cannot select or cut the text, this artifact might just be why.

    As I understand it, the Clipboard can have multiple representations
    on it at one time. Sometimes a person pastes into Notepad.exe and
    then copies a text again, as a means of "cleaning" any multi-item
    clipboards.
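The same "cleaning" idea can be sketched in Python without going through Notepad.exe. This is only a sketch under assumptions: launder_clipboard is a made-up name, it expects an object that behaves like the win32clipboard module from pywin32, and the FakeClipboard class exists purely so the example runs on any machine. On Windows with pywin32 installed, the intended call would be launder_clipboard(win32clipboard).

```python
CF_UNICODETEXT = 13  # standard Win32 clipboard format ID


def launder_clipboard(cb):
    """Replace the clipboard contents with only their plain-text flavor.

    `cb` should behave like pywin32's win32clipboard module.
    Returns the text that was kept, or None if no Unicode text was offered.
    """
    cb.OpenClipboard()
    try:
        if not cb.IsClipboardFormatAvailable(CF_UNICODETEXT):
            return None
        text = cb.GetClipboardData(CF_UNICODETEXT)
        cb.EmptyClipboard()  # drops every format, including any HTML flavor
        cb.SetClipboardData(CF_UNICODETEXT, text)
        return text
    finally:
        cb.CloseClipboard()


class FakeClipboard:
    """In-memory stand-in so the sketch can run without Windows."""

    def __init__(self, formats):
        self.formats = dict(formats)
        self.is_open = False

    def OpenClipboard(self):
        self.is_open = True

    def CloseClipboard(self):
        self.is_open = False

    def IsClipboardFormatAvailable(self, fmt):
        return fmt in self.formats

    def GetClipboardData(self, fmt):
        return self.formats[fmt]

    def EmptyClipboard(self):
        self.formats.clear()

    def SetClipboardData(self, fmt, data):
        self.formats[fmt] = data


# 49427 is a hypothetical ID standing in for a registered HTML format.
fake = FakeClipboard({49427: b"<html>...</html>", CF_UNICODETEXT: "Hello", 1: b"Hello"})
kept = launder_clipboard(fake)
print(kept, sorted(fake.formats))
```

After the call only the Unicode text flavor is left, so a receiving application has nothing else to choose from.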

    What you need in hand, is a clipboard viewer, to see what is on offer
    and what Notepad++ may have accessed as its choice. So far, I have not
    spotted the "perfect" chunk of code for this.

    *******

    https://stackoverflow.com/questions/35827764/how-to-know-the-type-of-data-in-clipboard-through-python

    import win32clipboard as clipboard # Example in Python, version unknown (needs pywin32)

    def getTheClipboardType():
        # Collect the ID of every format currently on the clipboard.
        formats = []
        clipboard.OpenClipboard()
        try:
            lastFormat = 0
            while True:
                nextFormat = clipboard.EnumClipboardFormats(lastFormat)
                if 0 == nextFormat:
                    # all done -- get out of the loop
                    break
                formats.append(nextFormat)
                lastFormat = nextFormat
        finally:
            # always release the clipboard, even if enumeration fails
            clipboard.CloseClipboard()
        return formats

    Example output:
    [13, 1, 49427, 49953, 49422, 49304, 16, 7]
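Those bare numbers are hard to read, so here is a small Python sketch that labels them. The low numbers are standard Win32 format IDs; anything from 0xC000 to 0xFFFF is a format registered by name at run time, so its name can only be looked up on the machine that produced it. The names describe_formats and lookup_name are made up for this sketch. On Windows, win32clipboard.GetClipboardFormatName from pywin32 could be passed as lookup_name; it is not needed to run the example.

```python
# A few of the predefined Win32 clipboard format IDs.
STANDARD_FORMATS = {
    1: "CF_TEXT",
    2: "CF_BITMAP",
    7: "CF_OEMTEXT",
    8: "CF_DIB",
    13: "CF_UNICODETEXT",
    15: "CF_HDROP",
    16: "CF_LOCALE",
    17: "CF_DIBV5",
}


def describe_formats(format_ids, lookup_name=None):
    """Pair each clipboard format ID with a readable label."""
    described = []
    for fmt in format_ids:
        if fmt in STANDARD_FORMATS:
            name = STANDARD_FORMATS[fmt]
        elif lookup_name is not None:
            # e.g. win32clipboard.GetClipboardFormatName on Windows
            name = lookup_name(fmt)
        elif 0xC000 <= fmt <= 0xFFFF:
            name = "registered format (name must be looked up on that machine)"
        else:
            name = "unknown"
        described.append((fmt, name))
    return described


labels = describe_formats([13, 1, 49427, 49953, 49422, 49304, 16, 7])
for fmt, name in labels:
    print(fmt, name)
```

For the example output above, this labels 13, 1, 16 and 7 as text-related standard formats and flags the four large numbers as registered formats, one of which may well be the HTML flavor.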

    *******

    Whereas the "Windows way" is to "ask" for a format, without
    enumerating what formats are available.

    *******

    Here is the "Hah!" moment.

    https://learn.microsoft.com/en-us/windows/win32/dataxchg/clipboard-formats

    "A window can place more than one object on the clipboard, each
    representing the same information in a different clipboard format.

    Users need not be aware of the clipboard formats used for an object on the clipboard. <=== Hah!
    "

    "To find out how many formats are currently used on the clipboard,
    call the CountClipboardFormats function."

    There is your problem definition.

    If you "copy" three files in Explorer file manager, then
    three items of "filesystem" type or so, are on there. Notepad++
    should not respond to such a clipboard format. But it is possible
    to extract the text strings from that.

    Paul
  • From Carlos E. R.@robin_listas@es.invalid to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Tue Feb 10 14:33:49 2026
    From Newsgroup: alt.comp.os.windows-10

    On 2026-02-10 13:55, Paul wrote:
    On Tue, 2/10/2026 7:17 AM, Maria Sophia wrote:
    PSA: HTML fragment mode interaction between Chromium, Clipboard & Notepad++


    As I understand it, the Clipboard can have multiple representations
    on it at one time. Sometimes a person pastes into Notepad.exe and
    then copies a text again, as a means of "cleaning" any multi-item
    clipboards.

    I think it goes back to Windows 3, I remember reading technical
    documents about it. I also think that the source application can put the metadata on the clipboard, but delays putting the actual data till the
    client actually requests it. Imagine pasting a picture in several large formats, it is expensive to store all format choices there.


    What you need in hand, is a clipboard viewer, to see what is on offer
    and what Notepad++ may have accessed as its choice. So far, I have not spotted the "perfect" chunk of code for this.


    Ah, no perfect clipboard app yet?
    --
    Cheers,
    Carlos E.R.
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Tue Feb 10 09:07:03 2026
    From Newsgroup: alt.comp.os.windows-10

    Maria Sophia wrote:

    When you copy text from any Chromium-based application, apparently Windows does not just copy plain text. Windows actually puts two versions of the
    text on the Windows clipboard:
    1. a normal plain-text version
    2. a hidden HTML-formatted version

    Not uncommon for many apps to make multiple formats available on the clipboard.

    When we paste that clipboard data into Notepad++, Notepad++ sees the hidden HTML version and Notepad++ assumes the paste is part of a larger HTML
    fragment. That puts Notepad++ into a strange internal state, which is
    apparently sometimes called "HTML fragment mode".

    I think notepad++ has a "paste special" command, does that let you pick
    the plain text version?

    Thanks for trying to help out, Andy, as I respect your knowledge and that
    of Paul, Carlos, and others, as I will always be open when I say I never
    really understood this stuff but I need to fix it with my Notepad++ macro.

    One of the first things I tried was Control+Shift+V and the Notepad++ hex editor, but neither seemed to help so I gave up on both of them long ago.

    Notepad++'s "Paste Simple Text" command only works if the clipboard
    actually contains a plain-text flavor that Notepad++ recognizes as usable.

    Chromium does put plain text on the clipboard but there's a catch.
    Chromium puts the HTML flavor first & marks it as the "preferred" format.

    Even when we use Paste Simple Text, Notepad++ still has to ask Windows for
    the available formats. If Windows reports the HTML flavor as the "best text-like format," Notepad++ may still pick it.

    Given that inconsistency, it seems...
    a. Ctrl+V always triggers HTML fragment mode
    b. Ctrl+Shift+V sometimes still triggers HTML fragment mode
    The only guaranteed way to break the mode is to modify the buffer (i.e., insert/delete a line).

    My issue is I can't fix it in my Notepad++ macro until I understand it.
    And to understand it, I'd like to *see* it. But it's invisible!

    Oddly, I don't see this problem with Mozilla browsers.
    Just Chromium-based browsers.

    Huh?

    Interestingly, both Chromium and Firefox can place multiple formats on the clipboard (plain text, HTML, images, etc.). That part is normal and very
    old Windows behavior as Carlos astutely already noted in a prior post.

    Chromium browsers (Edge, Chrome, Brave, Vivaldi, Electron apps) add extra metadata to the HTML clipboard flavor, e.g., StartHTML, EndHTML,
    StartFragment, EndFragment.

    Notepad++ interprets this as a fragment boundary and enters "HTML fragment mode." That's what breaks Ctrl+A and Ctrl+X.
    <https://github.com/notepad-plus-plus/notepad-plus-plus/issues/13748>

    Firefox does not add those fragment markers in the same way. It typically places plain text and a simpler HTML representation without the
    Chromium-style fragment metadata.
    <https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Interact_with_the_clipboard>

    Go figure. Did I mention I never understood all this character stuff?
  • From Paul@nospam@needed.invalid to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Tue Feb 10 10:01:48 2026
    From Newsgroup: alt.comp.os.windows-10

    On Tue, 2/10/2026 8:33 AM, Carlos E. R. wrote:
    On 2026-02-10 13:55, Paul wrote:

    What you need in hand, is a clipboard viewer, to see what is on offer
    and what Notepad++ may have accessed as its choice. So far, I have not
    spotted the "perfect" chunk of code for this.


    Ah, no perfect clipboard app yet?

    That's a reference to source code you could use for the purpose.
    There is a Powershell example or two, but they follow the same
    "Windows reasoning", that a user does not need to know what
    is on the clipboard.

    There may be a finished App for $39.95 to do it.

    There are various "laundering recipes" for fixing issues like that.
    Presumably this Notepad++ behavior has already been noted (somewhere).
    It would be unusual for "bad manners" to go unacknowledged.

    Paul
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Tue Feb 10 11:27:16 2026
    From Newsgroup: alt.comp.os.windows-10

    Paul wrote:
    There are various "laundering recipes" for fixing issues like that.
    Presumably this Notepad++ behavior has already been noted (somewhere).
    It would be unusual for "bad manners" to go unacknowledged.

    I wrote a Notepad++ Macro that "launders" text which is affected since I
    needed to copy/paste the text after the macro launders it to pure ASCII.

    Apparently when Chromium apps copy text, they don't just put CF_TEXT and CF_UNICODETEXT on the clipboard. They also include CF_HTML, the
    Microsoft-defined clipboard format registered under the name "HTML Format",
    which carries:
    a. An HTML fragment with metadata
    b. StartFragment / EndFragment markers
    c. Optional StartHTML / EndHTML offsets
    d. And a hidden boundary indicating an HTML document
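To make the metadata less mysterious, here is a minimal Python sketch that builds such a payload by hand. It follows Microsoft's published layout for the HTML clipboard format (a fixed-width header of byte offsets, then the HTML with fragment comments). The function name build_cf_html and the wrapper markup are illustrative choices, and a payload produced by Chromium will differ in detail.

```python
HEADER = (
    "Version:0.9\r\n"
    "StartHTML:{:010d}\r\n"
    "EndHTML:{:010d}\r\n"
    "StartFragment:{:010d}\r\n"
    "EndFragment:{:010d}\r\n"
)
PREFIX = "<html><body><!--StartFragment-->"
SUFFIX = "<!--EndFragment--></body></html>"


def build_cf_html(fragment: str) -> bytes:
    """Wrap an HTML fragment in a CF_HTML header with byte offsets."""
    # Every offset is zero-padded to ten digits, so the header width is fixed
    # and can be measured before the real numbers are known.
    header_len = len(HEADER.format(0, 0, 0, 0).encode("utf-8"))
    prefix = PREFIX.encode("utf-8")
    body = fragment.encode("utf-8")
    suffix = SUFFIX.encode("utf-8")

    start_html = header_len
    start_fragment = start_html + len(prefix)
    end_fragment = start_fragment + len(body)
    end_html = end_fragment + len(suffix)

    header = HEADER.format(start_html, end_html, start_fragment, end_fragment)
    return header.encode("utf-8") + prefix + body + suffix


payload = build_cf_html("<b>Hello</b>")
print(payload.decode("utf-8"))
```

The offsets count bytes, not characters, which is why the fragment is encoded before its length is measured.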

    Notepad++ doesn't render HTML, but it does detect the presence of CF_HTML
    and switches into a special internal mode intended for HTML-aware pasting.

    It's invisible because Notepad++ doesn't expose it in the UI, so my simple brain said "if it's invisible, it's not there" and yet, it was there.

    In that mode, Notepad++ treats the invisible HTML fragment boundary as if
    it were a zero-width first line. So three things caused me issues.
    1. Ctrl+A flashes but selects nothing
    2. Ctrl+X does nothing
    3. The caret behaves strangely at the start of the document
    All of this happens because Notepad++ is obeying the HTML fragment metadata.

    CF_HTML is not inserted into the document, which is why it never showed up
    in the hex editor. It exists only on the clipboard, not in the pasted text.

    Manually inserting a blank line forces Notepad++ to reinterpret the buffer
    as plain text. The invisible HTML fragment boundary is still there. But the invisible HTML fragment is no longer at the top of the buffer. Then,
    deleting the blank line forces a reparse of the entire document.

    What threw me off the trail was that Notepad++'s internal state changes
    based on the clipboard format, not based on the pasted bytes. So the text looked perfectly normal in Notepad++'s hex editor.

    I should re-try Andy's suggestion of Edit > Paste Special > Paste as ANSI
    Or use a plugin like "Paste as Plain Text" or use a clipboard cleaner such
    as PureText.

    For me, the issue was incredibly confusing because:
    a. The pasted text looks normal
    b. It doesn't happen with Firefox
    c. The broken behavior appears random
    d. There is no UI indicator that Notepad++ is in HTML fragment mode
    That is because it happens only when the clipboard contains CF_HTML, and
    it then persists until the buffer is modified in a way that forces a reset.

    Now that I know what I know, I can finally modify the shortcuts.xml macro
    so that it will consistently convert characters copied from web sites.
    --
    The nice thing about Usenet is you get good ideas from everyone.
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Wed Feb 11 16:10:45 2026
    From Newsgroup: alt.comp.os.windows-10

    Carlos E. R. wrote:
    There are various "laundering recipes" for fixing issues like that.
    Presumably this Notepad++ behavior has already been noted (somewhere).
    It would be unusual for "bad manners" to go unacknowledged.

    I remember reading references to clipboard managers, long ago. Maybe one
    by PC Magazine? Or maybe payware.

    Well, I guess I should look for a better clipboard manager than the native Windows one, but I've already fixed the Notepad++ shortcuts.xml to handle it.

    Since I'm a freeware junkie, I give my stuff away for free even as I've invested, oh, I don't know, scores of hours writing this unique macro.

    I'll post my latest Notepad++ shortcuts.xml file for all to benefit from.

    It has been tested against dirty Unicode-contaminated test cases with zero-width characters, combining marks, curly apostrophes, all sorts of
    weird spaces and invisible operators, Unicode dashes, line separators, etc.

    It's not perfect yet, but it's almost professional quality (IIDSSM).
    (If someone paid me for it, then it would be professional quality.) :)
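For readers without the macro, the same kind of laundering can be sketched in Python. This is not the shortcuts.xml macro and does not claim to match its behavior. The replacement table and the name launder_to_ascii are assumptions made for this sketch: look-alike punctuation is mapped to ASCII, accented letters lose their accents, and whatever is still non-ASCII afterwards is dropped.

```python
import unicodedata

# Assumed ASCII stand-ins; the real macro may map these differently.
REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'", "\u201b": "'", "\u02bc": "'", "\uff07": "'",
    "\u201c": '"', "\u201d": '"', "\u201e": '"', "\u201f": '"',
    "\u2010": "-", "\u2011": "-", "\u2012": "-", "\u2013": "-",
    "\u2014": "-", "\u2015": "-", "\u2212": "-", "\u2043": "-",
    "\u00a0": " ", "\u2006": " ", "\u2007": " ", "\u2008": " ",
    "\u2009": " ", "\u200a": " ", "\u202f": " ",
    "\u2028": "\n", "\u2029": "\n", "\u0085": "\n",
}
TABLE = str.maketrans(REPLACEMENTS)


def launder_to_ascii(text: str) -> str:
    """Reduce text to plain ASCII using the table above plus NFKD."""
    # Map look-alike punctuation first, in a single pass over the text.
    text = text.translate(TABLE)
    # NFKD splits accented letters into a base letter plus a combining mark.
    text = unicodedata.normalize("NFKD", text)
    # Anything still non-ASCII (combining marks, zero-width characters,
    # symbols with no ASCII equivalent) is dropped.
    return text.encode("ascii", "ignore").decode("ascii")


print(launder_to_ascii("Here\u2019s a \u201ctest\u201d \u2014 caf\u00e9"))
```

Because str.translate handles every mapped character in one pass, the result does not depend on the order of the table entries.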

    It has been tested against this testcase, for example.

    ##############################
    # BASIC APOSTROPHE TESTS
    ##############################

    Here’s
    Here‘s
    Hereʼs
    Hereʹs
    Hereˈs
    Hereˮs
    Hereꞌs
    Here＇s
    Hereʼs
    Here‛s
    Hereʾs
    Hereʿs

    ##############################
    # ZERO-WIDTH CHARACTERS
    ##############################

    HererCis (U+200B)
    HererCis (U+200D)
    HererCis (U+200C)
    HererUas (U+2060)
    HerrCies (U+200B between e and s)
    HererCis (U+200B at end)rCi
    HererCirCOs (U+200B between rCO and s)
    HerrCies (U+200C between e and s)
    HerrUaes (U+2060 between e and s)

    ##############################
    # COMBINING MARKS
    ##############################

    Here-as (U+0351)
    Here|cs (U+0307)
    Here|#s (U+0331)
    Here||s (U+0335)
    Here||s (U+0336)
    Here|+s (U+0337)
    Here|+s (U+0338)

    ##############################
    # DOUBLE QUOTES
    ##############################

    “Hello”
    ”Hello“
    „Hello‟
    ❝Hello❞

    ##############################
    # DASHES AND MINUS SIGNS
    ##############################

    A–B (EN DASH U+2013)
    A—B (EM DASH U+2014)
    A―B (HORIZONTAL BAR U+2015)
    A‐B (HYPHEN U+2010)
    A‑B (NON-BREAKING HYPHEN U+2011)
    A‒B (FIGURE DASH U+2012)
    A−B (MINUS SIGN U+2212)
    A⁃B (BULLET DASH U+2043)

    ##############################
    # SPACES
    ##############################

    Hello World (NO-BREAK SPACE U+00A0)
    HellorCcWorld (FIGURE SPACE U+2007)
    HellorC>World (NARROW NBSP U+202F)
    HellorCeWorld (THIN SPACE U+2009)
    HellorCeWorld (PUNCTUATION SPACE U+2008)
    HellorCaWorld (SIX-PER-EM SPACE U+2006)
    HellorCeWorld (HAIR SPACE U+200A)

    ##############################
    # SYMBOLS AND DIACRITICS
    ##############################

    ğ (U+011F)
    á (U+00E1)
    š (U+0161)
    ě (U+011B)
    ✓ (U+2713)
    • (U+2022)
    → (U+2192)
    ° (U+00B0)
    © (U+00A9)
    ® (U+00AE)
    ™ (U+2122)

    ##############################
    # INVISIBLE OPERATORS
    ##############################

    arUab (U+2060)
    arUib (U+2061)
    arUob (U+2062)
    arUub (U+2063)
    arUnb (U+2064)
    abaAb (U+180E)

    ##############################
    # SOFT HYPHEN AND LINE SEPARATORS
    ##############################

    soft-!hyphen (U+00AD)
    line
    separator (U+2028)
    para separator
    (U+2029)
    next
    line
    (U+0085)

    ##############################
    # MULTI-WORD STRESS TEST
    ##############################

    Here’s Here’s Here’s Here’s Here’s
    Here’s Here’s Here’s Here’s Here’s
    Here’s Here’s Here’s Here’s Here’s
    Here’s Here’s Here’s

    ##############################
    # MIXED CHAOS STRESS TEST
    ##############################

    HererCOs He-+rerCOs He-|rerCOs He-arerCOs HererCaisrCaarCatangledrCamessrCaofrCadashesrCoandrComarksrCa HererCOsrCiarCistringrCiwithrCizerorCawidthrCieverywhere
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Wed Feb 11 16:33:49 2026
    From Newsgroup: alt.comp.os.windows-10

    Maria Sophia wrote:
    It's not perfect yet, but it's almost professional quality (IIDSSM).
    (If someone paid me for it, then it would be professional quality.) :)

    It has been tested against this testcase, for example.

    Aurgh. It never ends. :(

    I'm trying to give you guys professional code, so I'm testing thoroughly.

    Here's another test case that it failed, so I'm working on adding a shortcuts.xml duplication section to cover when Notepad++'s macro engine
    fails to match U+200B when it sits between two characters that were just replaced, especially ' + U+200B + s.
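One way to check whether a laundering pass left anything behind is to list every remaining non-ASCII character with its position and Unicode name, which makes even zero-width characters visible. This is a hedged Python sketch, separate from the Notepad++ macro; the name find_non_ascii is made up for illustration.

```python
import unicodedata


def find_non_ascii(text: str):
    """List (line, column, code point, name) for every non-ASCII character."""
    hits = []
    # Split on "\n" only: str.splitlines() would also break on U+2028,
    # U+2029 and U+0085, which are exactly the characters being hunted.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((line_no, col, f"U+{ord(ch):04X}", name))
    return hits


# An apostrophe that was already normalized, followed by a leftover U+200B.
for hit in find_non_ascii("Here'\u200bs"):
    print(hit)
```

Run against the macro's output, an empty list would mean the text is pure ASCII, and any leftover U+200B shows up with its exact line and column.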

    ##############################
    # U+200B SECOND-PASS TEST
    ##############################

    # Case 1: U+200B between apostrophe and s
    Here'rCis (U+200B)

    # Case 2: U+200B after apostrophe
    Here'rCi s (U+200B)

    # Case 3: U+200B before apostrophe
    Here rCi's (U+200B)

    # Case 4: U+200B between two ASCII letters
    HerCire (U+200B)

    # Case 5: U+200B at end of word
    HererCi (U+200B)

    # Case 6: U+200B at start of word
    rCiHere (U+200B)

    # Case 7: U+200B between two Unicode characters
    He-+rCire (U+200B)

    # Case 8: U+200B between normalized apostrophe and s
    Here'rCis (U+200B)
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Wed Feb 11 18:00:47 2026
    From Newsgroup: alt.comp.os.windows-10

    This difficult test is proving that Notepad++ isn't running the macros in
    the order they appear in the shortcuts.xml file (apparently) which is
    driving me nuts. A rewrite is needed so here's the testcase & current file.

    ##############################
    # BASIC APOSTROPHE TESTS
    ##############################

    Here’s (U+2019)
    Here‘s (U+2018)
    Hereʼs (U+02BC)
    Hereʹs (U+02B9)
    Hereˈs (U+02C8)
    Hereˮs (U+02EE)
    Hereꞌs (U+A78C)
    Here＇s (U+FF07)
    Hereʼs (U+02BC)
    Here‛s (U+201B)
    Hereʾs (U+02BE)
    Hereʿs (U+02BF)

    ##############################
    # ZERO-WIDTH CHARACTERS
    ##############################

    HererCis (U+200B)
    HererCis (U+200D)
    HererCis (U+200C)
    HererUas (U+2060)
    HererCis (U+200B)
    HererCis (U+200B)
    Here'rCis (U+200B)
    HererCis (U+200C)
    HererUas (U+2060)

    ##############################
    # COMBINING MARKS
    ##############################

    Here-as (U+0351)
    Here|cs (U+0307)
    Here|#s (U+0331)
    Here||s (U+0335)
    Here||s (U+0336)
    Here|+s (U+0337)
    Here|+s (U+0338)

    ##############################
    # DOUBLE QUOTES
    ##############################

    “Hello” (U+201C / U+201D)
    „Hello” (U+201E / U+201D)
    ‟Hello” (U+201F)
    ❝Hello❞ (U+275D / U+275E)

    ##############################
    # DASHES AND MINUS SIGNS
    ##############################

    A‐B (U+2010)
    A‑B (U+2011)
    A‒B (U+2012)
    A—B (U+2014)
    A–B (U+2013)
    A―B (U+2015)
    A−B (U+2212)
    A⁃B (U+2043)

    ##############################
    # SPACES
    ##############################

    Hello World (U+00A0)
    HellorCcWorld (U+2007)
    HellorC>World (U+202F)
    HellorCeWorld (U+2009)
    HellorCeWorld (U+2008)
    HellorCaWorld (U+2006)
    HellorCeWorld (U+200A)

    ##############################
    # SYMBOLS AND DIACRITICS
    ##############################

    ğ (U+011F)
    á (U+00E1)
    š (U+0161)
    ě (U+011B)
    ✓ (U+2713)
    • (U+2022)
    → (U+2192)
    ° (U+00B0)
    © (U+00A9)
    ® (U+00AE)
    ™ (U+2122)

    ##############################
    # INVISIBLE OPERATORS
    ##############################

    abrUa (U+2060)
    abrUi (U+2061)
    abrUo (U+2062)
    abrUu (U+2063)
    abrUn (U+2064)
    abbaA (U+180E)

    ##############################
    # SOFT HYPHEN AND LINE SEPARATORS
    ##############################

    soft-!hyphen (U+00AD)
    line
    separator (U+2028)
    paraseparator (U+2029)
    next
    line (U+0085)

    ##############################
    # MULTI-WORD STRESS TEST
    ##############################

    Here’s Here’s Here’s Here’s Here’s (U+2019)
    Here’s Here’s Here’s Here’s Here’s (U+2019)
    Here’s Here’s Here’s Here’s Here’s (U+2019)
    Here’s Here’s Here’s (U+2019)

    ##############################
    # MIXED CHAOS STRESS TEST
    ##############################

    HererCOs He'rerCOs He'rerCOs HererCOs (U+2019) HererCaisrCaarCatangledrCamessrCaofrCadashesrCaandrCamarks... (U+2011) HererCOsastringwithzerorCawidtheverywhere (U+200B)
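    For the planned rewrite, one option is to generate the six-action search/replace blocks from a table instead of editing them by hand, so the rule order lives in one place. The following is a Python sketch only: the rule list is a hypothetical excerpt, the function names are illustrative, and the message numbers and lParam values are copied from the existing blocks rather than derived.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical excerpt: (search, replacement, comment label), in run order.
RULES = [
    ("\u2060", "", "U+2060 (WORD JOINER) with nothing"),
    ("\u200b", "", "U+200B (ZERO WIDTH SPACE) with nothing"),
    ("\u2019", "'", "U+2019 (RIGHT SINGLE QUOTATION MARK) with apostrophe"),
    ("&#151;", "-", "literal HTML entity for EM DASH with ASCII hyphen"),
]


def attr(value):
    """Escape text for a double-quoted XML attribute, writing anything
    outside printable ASCII as a hex character reference."""
    out = []
    for ch in value:
        if 32 <= ord(ch) < 127:
            out.append(escape(ch, {'"': "&quot;", "'": "&apos;"}))
        else:
            out.append(f"&#x{ord(ch):04X};")
    return "".join(out)


def replace_all_block(search, replacement, label):
    """Build one block with the same six messages the current file uses."""
    rows = [
        (1700, 0, ""),
        (1601, 0, search),
        (1625, 0, ""),
        (1602, 0, replacement),
        (1702, 768, ""),
        (1701, 1609, ""),
    ]
    lines = [f"<!-- Replace {label} -->"]
    for message, lparam, sparam in rows:
        lines.append(
            f'<Action type="3" message="{message}" wParam="0" '
            f'lParam="{lparam}" sParam="{attr(sparam)}" />'
        )
    return "\n".join(lines)


def build_macro_body(rules):
    """Join the blocks for every rule, keeping the table order."""
    return "\n\n".join(replace_all_block(*rule) for rule in rules)


if __name__ == "__main__":
    body = build_macro_body(RULES)
    ET.fromstring("<Macro>" + body + "</Macro>")  # raises if not well-formed
    print(body)
```

    Parsing the generated text before pasting it into shortcuts.xml catches an unescaped ampersand or an undefined entity early, and reordering the rules becomes a one-line change in the table.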
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Wed Feb 11 18:02:09 2026
    From Newsgroup: alt.comp.os.windows-10

    <?xml version="1.0" encoding="UTF-8" ?>
    <!-- C:\app\editor\txt\Notepad++\shortcuts.xml for Windows Notepad++ -->
    <!-- v3p9 20260211 Notepad++ is not running the macro in the order shown -->
    <!-- Notepad++ is executing macro actions in a different order -->
    <!-- than they appear in the XML so a total rewrite is needed in v4p0 -->
    <!-- v3p8 20260211 U+2060 is driving me nuts so it's the first block now -->
    <!-- v3p7 20260211 moved U+2060 up because it's the most disruptive -->
    <!-- v3p6 20260211 U+2009 & U+200B not being converted properly -->
    <!-- v3p5 20260211 fixed U+200B failing when U+200B is between ' & s -->
    <!-- A 2nd pass was duplicated after apostrophe normalization rules -->
    <!-- v3p4 20260211 added U+275E (heavy double quote right) -->
    <!-- v3p3 20260211 added U+2009 (thin space) -->
    <!-- v3p2 20260211 added seven new conversions after running testcases -->
    <!-- U+02BE (modifier letter right half ring) -->
    <!-- U+02BF (modifier letter left half ring) -->
    <!-- U+201E (double low-9 quote) -->
    <!-- U+201F (double high-reversed-9 quote) -->
    <!-- U+275D (heavy double quote left) -->
    <!-- U+275E (heavy double quote right) -->
    <!-- U+2015 (horizontal bar) -->
    <!-- U+2009 (thin space) -->
    <!-- v3p1 20260211 reorganized into a dozen distinct categories -->
    <!-- (1) control characters: U+000F U+0001 -->
    <!-- (2) dashes & minus signs: U+2010 U+2011 U+2012 U+2212 -->
    <!-- (3) zero-width characters: U+200C U+200B U+200D U+FEFF U+2060 -->
    <!-- (4) special spaces: U+00A0 U+2007 U+202F U+200A U+2008 U+2006 -->
    <!-- (5) apostrophe-like characters:
    U+0F0C U+2018 U+2019 U+2032 U+02BC U+02B9 U+02C8 U+02EE
    U+201B U+02CB U+A78C U+FF07 -->
    <!-- (6) combining marks (remove after apostrophes):
    U+0351 U+0307 U+0331 U+0335 U+0336 U+0337 U+0338 -->
    <!-- (7) double-quote normalization: U+201C U+201D -->
    <!-- (8) dash-like & ellipsis & HTML entities:
    U+2026 &#151; U+2014 U+2013 &zwnj; -->
    <!-- (9) bullets, math symbols, diacritics:
    U+2022 U+8722 U+011F U+2009 U+00E1 U+0161 U+011B -->
    <!-- (10) miscellaneous symbols:
    U+2713 ASCII hyphen ` U+2192 U+00B0 U+00A9 U+2122 U+00AE -->
    <!-- (11) invisible operators:
    U+00AD U+2061 U+2062 U+2063 U+2064 U+180E -->
    <!-- (12) line separators: U+2028 U+2029 U+0085 -->
    <!-- v3p0 20260211 added combining marks U+0351 U+0307 U+0331 -->
    <!-- v3p1 20260211 added apostrophe-like characters U+201B U+02CB -->
    <!-- v2p9 20260211 moved U+2060 to be above apostrophe-related blocks -->
    <!-- v2p8 20260211 fixed Chromium CF_HTML paste control+A anomaly -->
    <!-- v2p7 20260211 added U+02EE modifier letter double apostrophe rule -->
    <!-- v2p6 20260211 fixed U+02C8 (modifier letter vertical line) rule -->
    <!-- v2p5 20260211 fixed U+02B9 (modifier letter prime) rule -->
    <!-- v2p4 20260211 removed one of two U+000F blocks -->
    <!-- v2p3 20260211 removed two (duplicate) 1700 lines in U+0161 -->
    <!-- v2p2 20260211 fixed all zero-width blocks to replace with nothing -->
    <!-- v2p1 20260211 fixed BOM to replace with nothing -->
    <!-- v2p0 20260210 cleaned (emptied out) closing sections of the file -->
    <!-- v1p9 20260210 ported old shortcuts.xml to improve coverage -->
    <!-- Cleans Chromium pasted text & normalizes Unicode to ASCII -->
    <!-- Use model: paste (using control+v) & fix (using control+b) -->
    <!-- The macro should 1st break CF_HTML fragment mode (so Ctrl+A works) -->
    <!-- and then run the Unicode-to-ASCII cleanup on all the pasted text -->
    <!-- cutting (control+x) the result back into the Windows clipboard -->
    <!-- thereby leaving the Notepad++ GUI empty & ready for the next paste-->

    <!--
    To break Scintilla's CF_HTML fragment mode, we need to make any edit.
    We can insert a space & then delete that space, for example.
    -->

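    <!-- Macro command meanings
    (type 3 = Notepad++ Find/Replace dialog, type 0 = Scintilla message):
    1700 = IDC_FRCOMMAND_INIT (begin a new search/replace operation)
    1601 = IDFINDWHAT (set the search string: the Unicode character to find)
    1625 = IDNORMAL (set the search mode; lParam 0 = normal)
    1602 = IDREPLACEWITH (set the replacement string: the ASCII equivalent)
    1702 = IDC_FRCOMMAND_BOOLEANS (set option flags; 768 = 256 wrap + 512 down)
    1701 = IDC_FRCOMMAND_EXEC (run the command in lParam; 1609 = Replace All)
    2001 = SCI_ADDTEXT (wParam = length, lParam = text; SCI_REPLACESEL is 2170)
    2326 = SCI_DELETEBACK (deletes the space)
    2013 = SCI_SELECTALL (selects everything)
    2177 = SCI_CUT (cut all)
    -->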

    <NotepadPlus>
    <InternalCommands>
    <Shortcut id="43009" Ctrl="no" Alt="no" Shift="no" Key="0" />
    </InternalCommands>
    <Macros>

    <!-- When you paste from a Chromium-based app, the clipboard contains:
    CF_UNICODETEXT (plain text) & CF_HTML (HTML fragment)
    And sometimes CF_RTF where Notepad++ prefers CF_HTML if available.

    v2p0 fixes a Notepad++ selection issue caused by CF_HTML pastes.
    "HTML Paste Mode" prevents the "Control+A" from working.
    "HTML paste mode" inserts HTML fragment as plain text
    where Ctrl+A is disabled until the buffer is "normalized"
    (until the first edit that breaks the fragment state)
    -->

    <!-- ASCII "control+b" Cleanup Macro -->
    <Macro name="ASCII" Ctrl="yes" Alt="no" Shift="no" Key="66">

    <!-- Begin Scintilla HTML-paste workaround top portion -->
    <!-- Break Chromium CF_HTML fragment mode by adding & deleting a space-->
    <Action type="0" message="2001" wParam="32" lParam="0" sParam="" />
    <Action type="0" message="2326" wParam="0" lParam="0" sParam="" />
    <!-- Select all text before running cleanup -->
    <Action type="0" message="2013" wParam="0" lParam="0" sParam="" />
    <!-- End Scintilla HTML-paste workaround top portion -->

    <!-- BEGIN CONVERSION BLOCKS -->

    <!-- U+2060 is driving me nuts so I'm making it the 1st block -->
    <!-- U+2060 must be placed above the apostrophe-related blocks -->
    <!-- Otherwise apostrophe block may skip over it -->
    <!-- U+2060 is disruptive as it must be placed above zero-width too -->
    <!-- Replace U+2060 (WORD JOINER) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2060;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #1. CONTROL CHARACTERS (remove first) -->
    <!-- Replace U+000F (SHIFT IN control character) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x000F;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+0001 (SOH control character) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0001;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #2. DASHES & MINUS SIGNS (safest to remove early) -->
    <!-- Replace U+2010 (HYPHEN) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2010;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2011 (NON-BREAKING HYPHEN) with ASCII hyphen "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2011;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2012 (FIGURE DASH) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2012;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2212 (MINUS SIGN) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2212;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #3. ZERO-WIDTH CHARACTERS (must be BEFORE apostrophes) -->

    <!-- Replace U+200C (ZERO WIDTH NON-JOINER) with "" (nothing) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200C;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+200B (ZERO WIDTH SPACE) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200B;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+200D (ZERO WIDTH JOINER) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200D;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+FEFF (BOM) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#xFEFF;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />


    <!-- #4. SPECIAL SPACES (convert to ASCII space) -->
    <!-- Replace U+00A0 (NO-BREAK SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00A0;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2007 (FIGURE SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2007;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+202F (NARROW NO-BREAK SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x202F;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+200A (HAIR SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200A;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2008 (PUNCTUATION SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2008;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2006 (SIX-PER-EM SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2006;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #5. APOSTROPHE-LIKE CHARACTERS -->
    <!-- Replace U+0F0C (TIBETAN MARK DELIMITER) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0F0C;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2018 (LEFT SINGLE QUOTE) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2018;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2019 (RIGHT SINGLE QUOTATION) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2019;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2032 (PRIME) with ASCII apostrophe -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2032;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02BC (MODIFIER LETTER APOSTROPHE) with ASCII apostrophe -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02BC;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02B9 (MODIFIER LETTER PRIME) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02B9;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02C8 (MODIFIER LETTER VERTICAL LINE) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02C8;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02EE (MODIFIER LETTER DOUBLE APOSTROPHE) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02EE;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+201B (SINGLE HIGH-REVERSED-9 QUOTATION MARK) with apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201B;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02CB (MODIFIER LETTER GRAVE ACCENT) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02CB;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- This is a duplication which is after the apostrophes -->
    <!-- When U+200B appears between two characters that were already replaced -->
    <!-- the first pass fails to remove it, so I added this duplicate -->
    <!-- Remove U+200B (ZERO-WIDTH SPACE) second pass -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200B;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #6. COMBINING MARKS (remove only after apostrophes are done) -->
    <!-- Remove U+0351 (COMBINING LEFT HALF RING ABOVE) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0351;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0307 (COMBINING DOT ABOVE) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0307;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0331 (COMBINING MACRON BELOW) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0331;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />


    <!-- #7. DOUBLE-QUOTE NORMALIZATION -->
    <!-- Replace U+201C (LEFT DOUBLE QUOTE) with ASCII double quote " -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201C;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam='&quot;' />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+201D (RIGHT DOUBLE QUOTE) with ASCII double quote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201D;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam='&quot;' />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #8. ELLIPSIS, EM DASH, EN DASH, HTML ENTITIES -->
    <!-- Replace U+2026 (HORIZONTAL ELLIPSIS) with ASCII "..." -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2026;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="..." />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace literal &#151; (HTML entity for EM DASH) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&amp;#151;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2014 (EM DASH) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2014;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2013 (EN DASH) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2013;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace literal &zwnj; (ZERO WIDTH NON-JOINER entity) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&amp;zwnj;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #9. BULLETS, MATH SYMBOLS, LETTERS WITH DIACRITICS -->
    <!-- Replace U+2022 (BULLET) with ASCII "*" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2022;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&#x002A;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+8722 (a CJK ideograph; decimal 8722 is U+2212 MINUS SIGN, see #2) with ASCII "&" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x8722;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&amp;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+011F (LATIN SMALL G WITH BREVE) with ASCII "g" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x11f;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="g" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+00E1 (LATIN SMALL A WITH ACUTE) with ASCII "a" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#xe1;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="a" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+0161 (LATIN SMALL S WITH CARON) with ASCII "s" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x161;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="s" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+011B (LATIN SMALL E WITH CARON) with ASCII "e" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x11b;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="e" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #10. MISCELLANEOUS SYMBOLS -->
    <!-- Replace U+2713 (CHECK MARK) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2713;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace ASCII hyphen "-" with ASCII hyphen "-" (normalize) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace backtick with ASCII single quote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="`" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace Unicode Arrow (U+2192) with ASCII dash greaterthan -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2192;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="->" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace degree symbol with deg -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00B0;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="deg" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+00A9 (COPYRIGHT SIGN) with (C) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00A9;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="(C)" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace Trademark (U+2122) with (TM) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2122;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="(TM)" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace Registered (U+00AE) with (R) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00AE;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="(R)" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />


    <!-- #11. INVISIBLE OPERATORS (remove) -->
    <!-- Replace U+00AD (SOFT HYPHEN) with "" (remove completely) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00AD;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2061 (FUNCTION APPLICATION) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2061;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2062 (INVISIBLE TIMES) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2062;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2063 (INVISIBLE SEPARATOR) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2063;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2064 (INVISIBLE PLUS) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2064;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+180E (MONGOLIAN VOWEL SEPARATOR) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x180E;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #12. LINE SEPARATORS -->
    <!-- Replace U+2028 (LINE SEPARATOR) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2028;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2029 (PARAGRAPH SEPARATOR) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2029;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+0085 (NEXT LINE / NEL) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0085;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+A78C (LATIN SMALL LETTER SALTILLO) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#xA78C;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+FF07 (FULLWIDTH APOSTROPHE) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#xFF07;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0335 (COMBINING SHORT STROKE OVERLAY) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0335;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0336 (COMBINING LONG STROKE OVERLAY) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0336;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0337 (COMBINING SHORT SOLIDUS OVERLAY) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0337;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0338 (COMBINING LONG SOLIDUS OVERLAY) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0338;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2043 (HYPHEN BULLET) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2043;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02BE (MODIFIER LETTER RIGHT HALF RING) w/ ASCII apostrophe -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02BE;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02BF (MODIFIER LETTER LEFT HALF RING) w/ ASCII apostrophe -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02BF;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+201E (DOUBLE LOW-9 QUOTATION MARK) w/ ASCII double quote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201E;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&quot;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+201F (DOUBLE HIGH-REVERSED-9 QUOTATION MARK) w/ dquote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201F;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&quot;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+275D (HEAVY DOUBLE QUOTATION MARK ORNAMENT LEFT) w/ dquote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x275D;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&quot;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+275E (HEAVY DOUBLE QUOTATION MARK ORNAMENT RIGHT) w/ dquote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x275E;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&quot;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2015 (HORIZONTAL BAR) with ASCII hyphen -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2015;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2009 (THIN SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2009;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2009 (THIN SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2009;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- END OF CONVERSION BLOCKS -->

    <!-- Begin Scintilla HTML-paste workaround bottom portion -->
    <!-- Select all cleaned text -->
    <Action type="0" message="2013" wParam="0" lParam="0" sParam="" />

    <!-- Cut cleaned text to Windows clipboard -->
    <Action type="0" message="2177" wParam="0" lParam="0" sParam="" />
    <!-- End Scintilla HTML-paste workaround bottom portion -->

    <!-- Notepad++ will save shortcuts.xml automatically as it
    rewrites the file whenever shortcuts/macros/plugins change.
    Some sections are required so empty sections will be recreated.
    -->

    </Macro>
    </Macros>
    <UserDefinedCommands>
    </UserDefinedCommands>

    <PluginCommands />
    <ScintillaKeys />

    </NotepadPlus>
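
Every conversion block in the macro above is the same six Action lines with only the search and replacement strings changed, so the blocks can be generated rather than typed by hand. What follows is a minimal sketch under that assumption. The function name make_replace_block is hypothetical, and the message numbers and lParam values are simply copied from the blocks in the listing.

```python
from xml.sax.saxutils import escape

# Template for one <Action> line, copied from the layout used in the macro.
ACTION = '<Action type="3" message="{msg}" wParam="0" lParam="{lparam}" sParam="{sparam}" />'


def make_replace_block(codepoint, replacement):
    """Return the six <Action> lines that replace one code point everywhere."""
    # Write the search character as a hex character reference, e.g. &#x2122;
    find = f"&#x{codepoint:04X};"
    # Escape the replacement so quotes and ampersands stay valid in an attribute.
    repl = escape(replacement, {'"': "&quot;", "'": "&apos;"})
    rows = [
        (1700, 0, ""),       # start a new search/replace operation
        (1601, 0, find),     # search string
        (1625, 0, ""),       # same 1625 line as in every block of the listing
        (1602, 0, repl),     # replacement string
        (1702, 768, ""),     # same option value as in the listing
        (1701, 1609, ""),    # same final line as in the listing
    ]
    return "\n".join(ACTION.format(msg=m, lparam=l, sparam=s) for m, l, s in rows)


if __name__ == "__main__":
    print(make_replace_block(0x2122, "(TM)"))
```

Generating the blocks from one table of (code point, replacement) pairs also makes it harder to end up with duplicated or mismatched blocks.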
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Wed Feb 11 18:21:06 2026
    From Newsgroup: alt.comp.os.windows-10

    Maria Sophia wrote:

    <NotepadPlus>
    <InternalCommands>
    <Shortcut id="43009" Ctrl="no" Alt="no" Shift="no" Key="0" />
    </InternalCommands>
    <Macros>

    Whoever it was that wrote Notepad++/Scintilla should be shot.

    Notepad++ rewrote my file, because I did NOT put that block in the file.
    If I had, I would certainly have put it at the bottom.
    <NotepadPlus>
    <InternalCommands>...</InternalCommands>
    <Macros>
    <Macro name="ASCII" ...>
    <Action>...</Action>
    <Action>...</Action>
    ...
    </Macro>
    </Macros>
    </NotepadPlus>

    Apparently Notepad++ regenerates the XML in this order:
    <NotepadPlus>
    <InternalCommands>
    <Macros>
    <UserDefinedCommands>
    <ScintillaKeys>
    <UserShortcuts>
    <PluginCommands>
    <Macros> content

    It does not respect my original order.
    Specifically it reorders <Action> blocks.
    Which it does extremely unintelligently.

    The result is that U+2060 gets executed after the other zero-width characters.
    Aurgh. I didn't realize Notepad++ was messing with the shortcuts.xml!

    I have to force Notepad++ to rebuild the macro action list.
    1. Open Notepad++, making sure shortcuts.xml is not open in any editor.
    2. Go to Macro -> Modify Shortcut / Delete Macro
    3. Find the macro in the list & select it.
    4. Click "Delete" & then close Notepad++ completely
    5. Open shortcuts.xml in gVim (or any editor that is NOT Notepad++).
    6. Paste the original macro back into shortcuts.xml.
    Make sure the U+2060 removal block is the very first Action
    in the entire macro, before any other Action.
    7. Save the file making sure nothing else related is running.
    8. Start Notepad++ again and re-run the grueling test.
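
Step 6 can be checked before Notepad++ is restarted. The sketch below is a minimal illustration, not part of the original macro: it reads the XML with Python's standard library and lists the search strings (the 1601 lines) of one macro in file order, so it is easy to confirm that U+2060 comes first. The function name search_strings and the tiny SAMPLE document are hypothetical. For a real file, read the text from your own shortcuts.xml path.

```python
import xml.etree.ElementTree as ET


def search_strings(xml_text, macro_name):
    """Return the sParam of every 1601 (search string) Action, in file order."""
    root = ET.fromstring(xml_text)
    for macro in root.iter("Macro"):
        if macro.get("name") == macro_name:
            return [
                action.get("sParam", "")
                for action in macro.iter("Action")
                if action.get("message") == "1601"
            ]
    return []  # macro not present in the file


# Hypothetical two-block sample; the real macro has many more blocks.
SAMPLE = """<NotepadPlus>
  <Macros>
    <Macro name="ASCII">
      <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
      <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2060;" />
      <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
      <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
      <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200B;" />
      <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    </Macro>
  </Macros>
</NotepadPlus>"""

if __name__ == "__main__":
    found = search_strings(SAMPLE, "ASCII")
    print([f"U+{ord(s):04X}" for s in found])
    print("U+2060 is first:", bool(found) and found[0] == "\u2060")
```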

    This changed the shortcuts.xml to the following:
    <?xml version="1.0" encoding="UTF-8" ?>
    <!-- C:\app\editor\txt\Notepad++\shortcuts.xml for Windows Notepad++ -->
    <!-- v3p9 20260211 Notepad++ is not running the macro in the order shown -->
    <!-- Notepad++ is executing macro actions in a different order -->
    <!-- than they appear in the XML so a total rewrite is needed in v4p0 -->
    <!-- v3p8 20260211 U+2060 is driving me nuts so it's the first block now -->
    <!-- v3p7 20260211 moved U+2060 up because it's the most disruptive -->
    <!-- v3p6 20260211 U+2009 & U+200B not being converted properly -->
    <!-- v3p5 20260211 fixed U+200B failing when U+200B is between ' & s -->
    <!-- A 2nd pass was duplicated after apostrophe normalization rules -->
    <!-- v3p4 20260211 added U+275E (heavy double quote right) -->
    <!-- v3p3 20260211 added U+2009 (thin space) -->
    <!-- v3p2 20260211 added seven new conversions after running testcases -->
    <!-- U+02BE (modifier letter right half ring) -->
    <!-- U+02BF (modifier letter left half ring) -->
    <!-- U+201E (double low-9 quote) -->
    <!-- U+201F (double high-reversed-9 quote) -->
    <!-- U+275D (heavy double quote left) -->
    <!-- U+275E (heavy double quote right) -->
    <!-- U+2015 (horizontal bar) -->
    <!-- U+2009 (thin space) -->
    <!-- v3p1 20260211 reorganized into a dozen distinct categories -->
    <!-- (1) control characters: U+000F U+0001 -->
    <!-- (2) dashes & minus signs: U+2010 U+2011 U+2012 U+2212 -->
    <!-- (3) zero-width characters: U+200C U+200B U+200D U+FEFF U+2060 -->
    <!-- (4) special spaces: U+00A0 U+2007 U+202F U+200A U+2008 U+2006 -->
    <!-- (5) apostrophe-like characters:
    U+0F0C U+2018 U+2019 U+2032 U+02BC U+02B9 U+02C8 U+02EE
    U+201B U+02CB U+A78C U+FF07 -->
    <!-- (6) combining marks (remove after apostrophes):
    U+0351 U+0307 U+0331 U+0335 U+0336 U+0337 U+0338 -->
    <!-- (7) double-quote normalization: U+201C U+201D -->
    <!-- (8) dash-like & ellipsis & HTML entities:
    U+2026 &#151; U+2014 U+2013 &zwnj; -->
    <!-- (9) bullets, math symbols, diacritics:
    U+2022 U+2212 (&#8722;) U+011F U+2009 U+00E1 U+0161 U+011B -->
    <!-- (10) miscellaneous symbols:
    U+2713 ASCII hyphen ` U+2192 U+00B0 U+00A9 U+2122 U+00AE -->
    <!-- (11) invisible operators:
    U+00AD U+2061 U+2062 U+2063 U+2064 U+180E -->
    <!-- (12) line separators: U+2028 U+2029 U+0085 -->
    <!-- v3p0 20260211 added combining marks U+0351 U+0307 U+0331 -->
    <!-- v3p1 20260211 added apostrophe-like characters U+201B U+02CB -->
    <!-- v2p9 20260211 moved U+2060 to be above apostrophe-related blocks -->
    <!-- v2p8 20260211 fixed Chromium CF_HTML paste control+A anomaly -->
    <!-- v2p7 20260211 added U+02EE modifier letter double apostrophe rule -->
    <!-- v2p6 20260211 fixed U+02C8 (modifier letter vertical line) rule -->
    <!-- v2p5 20260211 fixed U+02B9 (modifier letter prime) rule -->
    <!-- v2p4 20260211 removed one of two U+000F blocks -->
    <!-- v2p3 20260211 removed two (duplicate) 1700 lines in U+0161 -->
    <!-- v2p2 20260211 fixed all zero-width blocks to replace with nothing -->
    <!-- v2p1 20260211 fixed BOM to replace with nothing -->
    <!-- v2p0 20260210 cleaned (emptied out) closing sections of the file -->
    <!-- v1p9 20260210 ported old shortcuts.xml to improve coverage -->
    <!-- Cleans Chromium pasted text & normalizes Unicode to ASCII -->
    <!-- Use model: paste (using control+v) & fix (using control+b) -->
    <!-- The macro should 1st break CF_HTML fragment mode (so Ctrl+A works) -->
    <!-- and then run the Unicode-to-ASCII cleanup on all the pasted text -->
    <!-- cutting (control+x) the result back into the Windows clipboard -->
    <!-- thereby leaving the Notepad++ GUI empty & ready for the next paste-->
    <!--
    To break Scintilla's CF_HTML fragment mode, we need to make any edit.
    We can insert a space & then delete that space, for example.
    -->

    <!-- Scintilla engine command meanings:
    1700 = IDC_FRCOMMAND_INIT (begin a new search/replace operation)
    1601 = IDFINDWHAT (set the search string: the Unicode character to find)
    1625 = IDNORMAL (set the search mode; lParam 0 is Normal)
    1602 = IDREPLACEWITH (set the replacement string: the ASCII equivalent)
    1702 = IDC_FRCOMMAND_BOOLEANS (set the search option flags from lParam)
    1701 = IDC_FRCOMMAND_EXEC (run the command in lParam; 1609 is Replace All)
    2001 = SCI_ADDTEXT (SCI_REPLACESEL is 2170; used here to insert a space)
    2326 = SCI_DELETEBACK (deletes the space)
    2013 = SCI_SELECTALL (selects everything)
    2177 = SCI_CUT (cut all)
    -->

    <!-- When you paste from a Chromium-based app, the clipboard contains:
    CF_UNICODETEXT (plain text) & CF_HTML (HTML fragment)
    And sometimes CF_RTF where Notepad++ prefers CF_HTML if available.

    v2p0 fixes a Notepad++ selection issue caused by CF_HTML pastes.
    "HTML Paste Mode" prevents the "Control+A" from working.
    "HTML paste mode" inserts HTML fragment as plain text
    where Ctrl+A is disabled until the buffer is "normalized"
    (until the first edit that breaks the fragment state)
    -->

    <!-- Below is garbage that Notepad++ adds to shortcuts.xml -->
    <NotepadPlus>
    <InternalCommands>
    <Shortcut id="43009" Ctrl="no" Alt="no" Shift="no" Key="0" />
    </InternalCommands>
    <Macros />
    <UserDefinedCommands />
    <PluginCommands />
    <ScintillaKeys />
    </NotepadPlus>

    This proves that only the comments remained.
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Wed Feb 11 18:37:51 2026
    From Newsgroup: alt.comp.os.windows-10

    Maria Sophia wrote:
    I have to force Notepad++ to rebuild the macro action list.
    1. Open Notepad++, making sure shortcuts.xml is not open in any editor.
    2. Go to Macro -> Modify Shortcut / Delete Macro
    3. Find the macro in the list & select it.
    4. Click "Delete" & then close Notepad++ completely
    5. Open shortcuts.xml in gVim (or any editor that is NOT Notepad++).
    6. Paste the original macro back into shortcuts.xml.
    Make sure the U+2060 removal block is the very first Action
    in the entire macro, before any other Action.
    7. Save the file making sure nothing else related is running.
    8. Start Notepad++ again and re-run the grueling test.

    OMG. Give me a gun. I want to shoot Scintilla's developers! :)
    (just kidding)

    But I am frustrated... as they wasted my valuable time.

    I finally figured out WHY Notepad++ wasn't running the macros
    in the order I wrote them.

    I had to FORCE Notepad++ to rebuild the entire macro from scratch!
    Then it worked fine in the same testcase it's been failing on.

    Jesus Christ. That's NOT intuitive.
    It's only intuitive AFTER you realize Notepad++ is rewriting
    shortcuts.xml (keeping only the comments out of what you added).

    Who knew?
    Not me.
    Now I do!

    Notepad++ does NOT execute macros directly from shortcuts.xml.
    Once I forced it to read macros from shortcuts.xml...
    Apostrophes are working.
    Zero-width characters too.
    Combining marks also.
    Double quotes, dashes, symbols and diacritics are working.
    So are invisible operators.
    Soft hyphens & line separators too.
    Even the multi-word stress test is working.
    As is the mixed-chaos jumbled test.

    And yet, I changed nothing in the shortcuts.xml file!
    Jesus Christ.

    There was nothing wrong with the programming on my side.
    It was all because Notepad++ doesn't do what we think it does.

    Sheesh. It drives me nuts when I can't solve a problem.
    I almost never fail, so this one was driving me nuts.

    Mainly because I didn't understand what was happening.
    My "logic" was fine (as I'm extremely logical).

    It was simply that I was blindsided by Notepad++ changing the order.
    Without telling me it changed the order.

    By deleting the macro, the macro was finally rebuilt internally.
    Notepad++ finally loaded the new action order.
    U+2060 finally executed first!

    When Notepad++ starts, it reads shortcuts.xml ONCE.
    It loads all macros into an INTERNAL ARRAY in memory.
    After that, the XML file is ignored.

    Only when we edit shortcuts.xml while Notepad++ is closed,
    does the next startup load the new version... but... but...

    But...

    If the XML structure is malformed,
    or the macro is outside the <Macros> block,
    or the <Macros> block is empty,
    or Notepad++ rewrites the file,
    or the macro name changes,
    or the macro is deleted and not recreated,

    then Notepad++ will:

    silently discard the macro,
    rebuild the XML in its own structure,
    and load the LAST VALID INTERNAL VERSION it had.

    This looks like "Notepad++ is using an old cache."
    But it's not a cache.
    It's the internal macro array.

    Deleting the macro inside Notepad++ clears the internal array.
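
Several of the failure conditions listed above can be checked from outside Notepad++ before it starts. The following is a minimal sketch, assuming only that the file should be well-formed XML with the macro sitting inside a non-empty <Macros> block. The function name diagnose is hypothetical, and whether Notepad++ really discards a macro in each of these cases is the observation reported in this post, not something the script proves.

```python
import xml.etree.ElementTree as ET


def diagnose(xml_text, macro_name):
    """Return a list of structural problems found in a shortcuts.xml text."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Covers unclosed tags and unclosed comments alike.
        return [f"XML is malformed: {err}"]

    macros = root.find("Macros")
    if macros is None:
        return ["no <Macros> block under the root element"]

    problems = []
    if len(macros) == 0:
        problems.append("<Macros> block is empty")

    named = [m for m in root.iter("Macro") if m.get("name") == macro_name]
    if not named:
        problems.append(f"macro {macro_name!r} not found")
    elif all(m not in list(macros) for m in named):
        problems.append("macro is outside the <Macros> block")
    return problems


if __name__ == "__main__":
    # The emptied file that Notepad++ wrote back, reduced to its skeleton.
    emptied = "<NotepadPlus><Macros /></NotepadPlus>"
    for problem in diagnose(emptied, "ASCII"):
        print(problem)
```

An empty result only means the structure looks sane. It says nothing about what Notepad++ currently holds in memory.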

    I'm out of energy for today, but I will try to build a
    version 4.0 (v4p0) that Notepad++ cannot reorder internally.
    --
    The problem with computers is we humans are trying to beat them.
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Wed Feb 11 18:48:05 2026
    From Newsgroup: alt.comp.os.windows-10

    Maria Sophia wrote:
    I'm out of energy for today, but I will try to build a
    version 4.0 (v4p0) that Notepad++ cannot reorder internally.

    Here is the last known good version of shortcuts.xml before
    I try to change it so that Notepad++ can't reorder it.

    My habit is that if something takes a dozen steps, I halve it to six,
    then to 3, and then to 1. This one is down to 2 steps.

    Step A is control+c something off the web using Chromium
    Step B is Win+R > n (to bring up notepad++)

    Step 1 is control+v into notepad++
    Step 2 is control+b to run the macro (which does many things)

    The macro handles a lot of things, but first and foremost it resolves
    the HTML fragment issue by adding and deleting a space. Then it
    automatically selects all the pasted text, cleans it up, and wipes it
    out of the editor after putting it back into the Windows clipboard.

    At this point, after control+b, the Windows clipboard has been cleaned
    of Unicode and is as pure ASCII as I can make the Windows clipboard be.

    Whew!
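
To test whether a cleaned result really is pure ASCII, the same kind of mapping can be expressed outside Notepad++. This is a minimal sketch, not the macro itself: ASCII_MAP holds only a handful of the conversions from the macro, and the names to_ascii and non_ascii are hypothetical. Anything non_ascii still reports after cleaning is a character the mapping does not cover yet.

```python
# A small subset of the macro's conversions, keyed by code point.
ASCII_MAP = {
    0x2192: "->",    # rightwards arrow
    0x00B0: "deg",   # degree sign
    0x00A9: "(C)",   # copyright sign
    0x2122: "(TM)",  # trade mark sign
    0x00AE: "(R)",   # registered sign
    0x2713: " ",     # check mark
    0x2009: " ",     # thin space
    0x2015: "-",     # horizontal bar
    0x201E: '"',     # double low-9 quotation mark
    0xFF07: "'",     # fullwidth apostrophe
    0x00AD: "",      # soft hyphen (removed)
    0x2060: "",      # word joiner (removed)
}


def to_ascii(text):
    """Apply every mapping in one pass; unmapped characters pass through."""
    return text.translate(ASCII_MAP)


def non_ascii(text):
    """List the code points above U+007F that are still present."""
    return sorted({f"U+{ord(ch):04X}" for ch in text if ord(ch) > 0x7F})


if __name__ == "__main__":
    sample = "25\u00B0C \u2192 77\u00B0F\u2122"
    cleaned = to_ascii(sample)
    print(cleaned)
    print("left over:", non_ascii(cleaned))
```

Unlike the macro, a translate table is applied in a single pass, so the order of the entries does not matter.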

    <?xml version="1.0" encoding="UTF-8" ?>
    <!-- C:\app\editor\txt\Notepad++\shortcuts.xml for Windows Notepad++ -->
    <!-- v3p9 20260211 Notepad++ is not running the macro in the order shown -->
    <!-- Notepad++ is executing macro actions in a different order -->
    <!-- than they appear in the XML so a total rewrite is needed in v4p0 -->
    <!-- v3p8 20260211 U+2060 is driving me nuts so it's the first block now -->
    <!-- v3p7 20260211 moved U+2060 up because it's the most disruptive -->
    <!-- v3p6 20260211 U+2009 & U+200B not being converted properly -->
    <!-- v3p5 20260211 fixed U+200B failing when U+200B is between ' & s -->
    <!-- A 2nd pass was duplicated after apostrophe normalization rules -->
    <!-- v3p4 20260211 added U+275E (heavy double quote right) -->
    <!-- v3p3 20260211 added U+2009 (thin space) -->
    <!-- v3p2 20260211 added seven new conversions after running testcases -->
    <!-- U+02BE (modifier letter right half ring) -->
    <!-- U+02BF (modifier letter left half ring) -->
    <!-- U+201E (double low-9 quote) -->
    <!-- U+201F (double high-reversed-9 quote) -->
    <!-- U+275D (heavy double quote left) -->
    <!-- U+275E (heavy double quote right) -->
    <!-- U+2015 (horizontal bar) -->
    <!-- U+2009 (thin space) -->
    <!-- v3p1 20260211 reorganized into a dozen distinct categories -->
    <!-- (1) control characters: U+000F U+0001 -->
    <!-- (2) dashes & minus signs: U+2010 U+2011 U+2012 U+2212 -->
    <!-- (3) zero-width characters: U+200C U+200B U+200D U+FEFF U+2060 -->
    <!-- (4) special spaces: U+00A0 U+2007 U+202F U+200A U+2008 U+2006 -->
    <!-- (5) apostrophe-like characters:
    U+0F0C U+2018 U+2019 U+2032 U+02BC U+02B9 U+02C8 U+02EE
    U+201B U+02CB U+A78C U+FF07 -->
    <!-- (6) combining marks (remove after apostrophes):
    U+0351 U+0307 U+0331 U+0335 U+0336 U+0337 U+0338 -->
    <!-- (7) double-quote normalization: U+201C U+201D -->
    <!-- (8) dash-like & ellipsis & HTML entities:
    U+2026 &#151; U+2014 U+2013 &zwnj; -->
    <!-- (9) bullets, math symbols, diacritics:
    U+2022 U+2212 (&#8722;) U+011F U+2009 U+00E1 U+0161 U+011B -->
    <!-- (10) miscellaneous symbols:
    U+2713 ASCII hyphen ` U+2192 U+00B0 U+00A9 U+2122 U+00AE -->
    <!-- (11) invisible operators:
    U+00AD U+2061 U+2062 U+2063 U+2064 U+180E -->
    <!-- (12) line separators: U+2028 U+2029 U+0085 -->
    <!-- v3p0 20260211 added combining marks U+0351 U+0307 U+0331 -->
    <!-- v3p1 20260211 added apostrophe-like characters U+201B U+02CB -->
    <!-- v2p9 20260211 moved U+2060 to be above apostrophe-related blocks -->
    <!-- v2p8 20260211 fixed Chromium CF_HTML paste control+A anomaly -->
    <!-- v2p7 20260211 added U+02EE modifier letter double apostrophe rule -->
    <!-- v2p6 20260211 fixed U+02C8 (modifier letter vertical line) rule -->
    <!-- v2p5 20260211 fixed U+02B9 (modifier letter prime) rule -->
    <!-- v2p4 20260211 removed one of two U+000F blocks -->
    <!-- v2p3 20260211 removed two (duplicate) 1700 lines in U+0161 -->
    <!-- v2p2 20260211 fixed all zero-width blocks to replace with nothing -->
    <!-- v2p1 20260211 fixed BOM to replace with nothing -->
    <!-- v2p0 20260210 cleaned (emptied out) closing sections of the file -->
    <!-- v1p9 20260210 ported old shortcuts.xml to improve coverage -->
    <!-- Cleans Chromium pasted text & normalizes Unicode to ASCII -->
    <!-- Usage: paste (using control+v) & then fix (using control+b) -->
    <!-- The macro 1st breaks CF_HTML fragment mode (so Ctrl+A works), -->
    <!-- then runs the Unicode-to-ASCII cleanup on all the pasted text, -->
    <!-- then cuts (control+x) the result back to the Windows clipboard, -->
    <!-- leaving the Notepad++ GUI empty & ready for the next paste -->
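    <!--
    As a cross-check outside Notepad++, the same normalization can be
    sketched in Python. This is only a sketch under assumptions: it covers a
    subset of the rules in this file, applies them in one pass instead of
    one Replace All per rule, and the names CLEANUP & to_ascii are
    hypothetical (they are not part of shortcuts.xml).

```python
# Subset of the macro's rules as a translation table (None = delete).
CLEANUP = {
    # zero-width characters are removed
    0x2060: None, 0x200B: None, 0x200C: None, 0x200D: None, 0xFEFF: None,
    # hyphens, dashes & minus signs become an ASCII hyphen
    0x2010: "-", 0x2011: "-", 0x2012: "-", 0x2212: "-",
    0x2013: "-", 0x2014: "-",
    # special spaces become an ASCII space
    0x00A0: " ", 0x2007: " ", 0x202F: " ", 0x200A: " ",
    # apostrophe-like characters become an ASCII apostrophe
    0x2018: "'", 0x2019: "'", 0x2032: "'", 0x02BC: "'",
    # curly double quotes become an ASCII double quote
    0x201C: '"', 0x201D: '"',
    # ellipsis & bullet
    0x2026: "...", 0x2022: "*",
}


def to_ascii(text: str) -> str:
    """Apply the subset of rules above in a single pass."""
    return text.translate(CLEANUP)


if __name__ == "__main__":
    sample = "it\u2019s \u201cfine\u201d\u2026 a\u2060b\u00a0c"
    print(to_ascii(sample))
```

    Running the same test text through both the macro & this sketch
    is one way to spot a rule that did not fire.
    -->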

    <!--
    To break Scintilla's CF_HTML fragment mode, we need to make any edit.
    We can insert a space & then delete that space, for example.
    -->

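    <!-- Notepad++ & Scintilla message numbers relevant to this macro:
    1700 = IDC_FRCOMMAND_INIT (begin a new search/replace operation)
    1601 = IDFINDWHAT (set the search string: the Unicode character to find)
    1625 = IDNORMAL (set the search mode; wParam 0 = normal)
    1602 = IDREPLACEWITH (set the replacement string: the ASCII equivalent)
    1702 = IDC_FRCOMMAND_BOOLEANS (set option flags; 768 = wrap around + search down)
    1701 = IDC_FRCOMMAND_EXEC (run the command in lParam; 1609 = Replace All)
    2001 = SCI_ADDTEXT (wParam = byte count, lParam = text pointer)
    2170 = SCI_REPLACESEL (replaces the selection with the sParam text, e.g. a space)
    2326 = SCI_DELETEBACK (deletes the character before the caret)
    2013 = SCI_SELECTALL (selects everything)
    2177 = SCI_CUT (cuts the selection to the clipboard)
    -->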
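    <!--
    Every conversion block in this macro repeats the same six Action lines;
    only the 1601 (find) & 1602 (replace) strings change. As a hedged sketch,
    not part of shortcuts.xml itself, the Python below generates one such
    block. The names replace_all_block & escape_attr are hypothetical, and
    the values 768 & 1609 are simply copied from the blocks in this file.

```python
def escape_attr(text: str) -> str:
    """Escape text for use inside a double-quoted XML attribute."""
    out = []
    for ch in text:
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == '"':
            out.append("&quot;")
        elif ord(ch) < 32 or ord(ch) > 126:
            # Non-ASCII & control characters become numeric references
            out.append(f"&#x{ord(ch):04X};")
        else:
            out.append(ch)
    return "".join(out)


def replace_all_block(find: str, replace: str, flags: int = 768) -> str:
    """Return the six <Action> lines for one find/replace step."""
    rows = [
        (1700, 0, 0, ""),
        (1601, 0, 0, escape_attr(find)),
        (1625, 0, 0, ""),
        (1602, 0, 0, escape_attr(replace)),
        (1702, 0, flags, ""),
        (1701, 0, 1609, ""),
    ]
    return "\n".join(
        f'<Action type="3" message="{msg}" wParam="{wp}" lParam="{lp}" sParam="{sp}" />'
        for msg, wp, lp, sp in rows
    )


if __name__ == "__main__":
    print(replace_all_block("\u2060", ""))
```

    Generating the blocks this way avoids copy & paste slips such as a
    replacement string left over from the previous block.
    -->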

    <!-- When you paste from a Chromium-based app, the clipboard contains
    CF_UNICODETEXT (plain text) & CF_HTML (an HTML fragment),
    and sometimes CF_RTF. Notepad++ apparently prefers CF_HTML if available.

    v2p0 fixes a Notepad++ selection issue caused by CF_HTML pastes.
    In this "HTML paste mode" the HTML fragment is inserted as plain text,
    but Ctrl+A does not work until the buffer is "normalized"
    (that is, until the first edit breaks the fragment state).
    -->
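    <!--
    For reference, a CF_HTML payload is UTF-8 text that starts with a small
    header (Version, StartHTML, EndHTML, StartFragment, EndFragment) whose
    numbers are byte offsets into the payload. The Python sketch below builds
    & parses such a payload to show the layout. It is a sketch only: the
    names build_cf_html & parse_cf_html are hypothetical, it never touches
    the real clipboard, and real payloads from Chromium also carry fragment
    marker comments and may carry a SourceURL line.

```python
HEADER = (
    "Version:0.9\r\n"
    "StartHTML:{:010d}\r\n"
    "EndHTML:{:010d}\r\n"
    "StartFragment:{:010d}\r\n"
    "EndFragment:{:010d}\r\n"
)


def build_cf_html(fragment: str) -> bytes:
    """Wrap an HTML fragment in a minimal CF_HTML payload."""
    prefix = b"<html><body>"
    suffix = b"</body></html>"
    body = fragment.encode("utf-8")
    # Offsets are zero-padded to a fixed width, so the header length is known
    start_html = len(HEADER.format(0, 0, 0, 0))
    start_frag = start_html + len(prefix)
    end_frag = start_frag + len(body)
    end_html = end_frag + len(suffix)
    header = HEADER.format(start_html, end_html, start_frag, end_frag)
    return header.encode("ascii") + prefix + body + suffix


def parse_cf_html(payload: bytes) -> dict:
    """Read the header fields and slice out the fragment by byte offset."""
    fields = {}
    pos = 0
    while True:
        end = payload.find(b"\n", pos)
        if end == -1:
            break
        key, sep, value = payload[pos:end].rstrip(b"\r").partition(b":")
        if not sep or not key.isalpha():
            break  # first line that is not "Key:value" ends the header
        fields[key.decode("ascii")] = value.decode("utf-8")
        pos = end + 1
    start = int(fields["StartFragment"])
    stop = int(fields["EndFragment"])
    fields["fragment"] = payload[start:stop].decode("utf-8")
    return fields


if __name__ == "__main__":
    print(parse_cf_html(build_cf_html("it\u2019s"))["fragment"])
```

    The plain-text CF_UNICODETEXT copy has no such header, which is one
    way to tell the two clipboard formats apart when inspecting them.
    -->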

    <!-- Below is garbage that Notepad++ adds to shortcuts.xml -->
    <NotepadPlus>
    <InternalCommands>
    <Shortcut id="43009" Ctrl="no" Alt="no" Shift="no" Key="0" />
    </InternalCommands>
    <Macros>
    <!-- Above is garbage that Notepad++ adds to shortcuts.xml -->

    <!-- ASCII "control+b" Cleanup Macro -->
    <Macro name="ASCII" Ctrl="yes" Alt="no" Shift="no" Key="66">

    <!-- Begin Scintilla HTML-paste workaround top portion -->
    <!-- Break Chromium CF_HTML fragment mode by adding & deleting a space -->
    <!-- type="1" with 2170 (SCI_REPLACESEL) passes the sParam text to Scintilla -->
    <Action type="1" message="2170" wParam="0" lParam="0" sParam=" " />
    <Action type="0" message="2326" wParam="0" lParam="0" sParam="" />
    <!-- Select all text before running cleanup -->
    <Action type="0" message="2013" wParam="0" lParam="0" sParam="" />
    <!-- End Scintilla HTML-paste workaround top portion -->

    <!-- BEGIN CONVERSION BLOCKS -->

    <!-- U+2060 is driving me nuts so I'm making it the 1st block -->
    <!-- U+2060 must be placed above the apostrophe-related blocks -->
    <!-- Otherwise apostrophe block may skip over it -->
    <!-- U+2060 is disruptive as it must be placed above zero-width too -->
    <!-- Replace U+2060 (WORD JOINER) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2060;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #1. CONTROL CHARACTERS (remove first) -->
    <!-- Replace U+000F (SHIFT IN control character) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x000F;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+0001 (SOH control character) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0001;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #2. DASHES & MINUS SIGNS (safest to remove early) -->
    <!-- Replace U+2010 (HYPHEN) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2010;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2011 (NON-BREAKING HYPHEN) with ASCII hyphen "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2011;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2012 (FIGURE DASH) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2012;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2212 (MINUS SIGN) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2212;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #3. ZERO-WIDTH CHARACTERS (must be BEFORE apostrophes) -->

    <!-- Replace U+200C (ZERO WIDTH NON-JOINER) with "" (nothing) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200C;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+200B (ZERO WIDTH SPACE) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200B;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+200D (ZERO WIDTH JOINER) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200D;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+FEFF (BOM) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#xFEFF;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #4. SPECIAL SPACES (convert to ASCII space) -->
    <!-- Replace U+00A0 (NO-BREAK SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00A0;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2007 (FIGURE SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2007;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+202F (NARROW NO-BREAK SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x202F;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+200A (HAIR SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200A;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2008 (PUNCTUATION SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2008;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2006 (SIX-PER-EM SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2006;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #5. APOSTROPHE-LIKE CHARACTERS -->
    <!-- Replace U+0F0C (TIBETAN MARK DELIMITER) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0F0C;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2018 (LEFT SINGLE QUOTE) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2018;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2019 (RIGHT SINGLE QUOTATION MARK) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2019;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2032 (PRIME) with ASCII apostrophe -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2032;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02BC (MODIFIER LETTER APOSTROPHE) with ASCII apostrophe -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02BC;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02B9 (MODIFIER LETTER PRIME) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02B9;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02C8 (MODIFIER LETTER VERTICAL LINE) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02C8;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02EE (MODIFIER LETTER DOUBLE APOSTROPHE) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02EE;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+201B (SINGLE HIGH-REVERSED-9 QUOTATION MARK) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201B;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02CB (MODIFIER LETTER GRAVE ACCENT) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02CB;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- This is a duplicate pass placed after the apostrophe blocks -->
    <!-- When U+200B sits between two characters that were already replaced, -->
    <!-- the first pass fails to remove it, so I added this duplicate -->
    <!-- Remove U+200B (ZERO WIDTH SPACE), second pass -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200B;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #6. COMBINING MARKS (remove only after apostrophes are done) -->
    <!-- Remove U+0351 (COMBINING LEFT HALF RING ABOVE) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0351;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0307 (COMBINING DOT ABOVE) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0307;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0331 (COMBINING MACRON BELOW) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0331;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />


    <!-- #7. DOUBLE-QUOTE NORMALIZATION -->
    <!-- Replace U+201C (LEFT DOUBLE QUOTE) with ASCII double quote " -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201C;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam='&quot;' />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+201D (RIGHT DOUBLE QUOTE) with ASCII double quote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201D;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam='&quot;' />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #8. ELLIPSIS, EM DASH, EN DASH, HTML ENTITIES -->
    <!-- Replace U+2026 (HORIZONTAL ELLIPSIS) with ASCII "..." -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2026;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="..." />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace literal &#151; (HTML entity for EM DASH) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&amp;#151;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2014 (EM DASH) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2014;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2013 (EN DASH) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2013;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace literal &zwnj; (ZERO WIDTH NON-JOINER entity) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&amp;zwnj;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #9. BULLETS, MATH SYMBOLS, LETTERS WITH DIACRITICS -->
    <!-- Replace U+2022 (BULLET) with ASCII "*" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2022;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&#x002A;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace literal &#8722; (HTML entity for U+2212 MINUS SIGN) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&amp;#8722;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+011F (LATIN SMALL G WITH BREVE) with ASCII "g" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x11f;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="g" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+00E1 (LATIN SMALL A WITH ACUTE) with ASCII "a" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#xe1;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="a" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+0161 (LATIN SMALL S WITH CARON) with ASCII "s" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x161;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="s" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+011B (LATIN SMALL E WITH CARON) with ASCII "e" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x11b;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="e" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #10. MISCELLANEOUS SYMBOLS -->
    <!-- Replace U+2713 (CHECK MARK) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2713;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- ASCII hyphen "-" to ASCII hyphen "-": as written, this block is a no-op -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace backtick with ASCII single quote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="`" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2192 (RIGHTWARDS ARROW) with ASCII hyphen + greater-than -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2192;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="->" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+00B0 (DEGREE SIGN) with "deg" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00B0;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="deg" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+00A9 (COPYRIGHT SIGN) with (C) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00A9;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="(C)" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace Trademark (U+2122) with (TM) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2122;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="(TM)" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace Registered (U+00AE) with (R) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00AE;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="(R)" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />


    <!-- #11. INVISIBLE OPERATORS (remove) -->
    <!-- Replace U+00AD (SOFT HYPHEN) with "" (remove completely) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00AD;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2061 (FUNCTION APPLICATION) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2061;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2062 (INVISIBLE TIMES) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2062;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2063 (INVISIBLE SEPARATOR) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2063;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2064 (INVISIBLE PLUS) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2064;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+180E (MONGOLIAN VOWEL SEPARATOR) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x180E;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #12. LINE SEPARATORS -->
    <!-- Replace U+2028 (LINE SEPARATOR) with an ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2028;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2029 (PARAGRAPH SEPARATOR) with an ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2029;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+0085 (NEXT LINE / NEL) with an ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0085;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+A78C (LATIN SMALL LETTER SALTILLO) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#xA78C;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+FF07 (FULLWIDTH APOSTROPHE) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#xFF07;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0335 (COMBINING SHORT STROKE OVERLAY) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0335;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0336 (COMBINING LONG STROKE OVERLAY) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0336;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0337 (COMBINING SHORT SOLIDUS OVERLAY) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0337;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0338 (COMBINING LONG SOLIDUS OVERLAY) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0338;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2043 (HYPHEN BULLET) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2043;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02BE (MODIFIER LETTER RIGHT HALF RING) w/ ASCII apostrophe -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02BE;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02BF (MODIFIER LETTER LEFT HALF RING) w/ ASCII apostrophe -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02BF;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+201E (DOUBLE LOW-9 QUOTATION MARK) w/ ASCII double quote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201E;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&quot;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+201F (DOUBLE HIGH-REVERSED-9 QUOTATION MARK) w/ dquote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201F;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&quot;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+275D (HEAVY DOUBLE QUOTATION MARK ORNAMENT LEFT) w/ dquote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x275D;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&quot;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+275E (HEAVY DOUBLE QUOTATION MARK ORNAMENT RIGHT) w/ dquote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x275E;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&quot;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2015 (HORIZONTAL BAR) with ASCII hyphen -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2015;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2009 (THIN SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2009;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- END OF CONVERSION BLOCKS -->

    <!-- Begin Scintilla HTML-paste workaround bottom portion -->
    <!-- Select all cleaned text -->
    <Action type="0" message="2013" wParam="0" lParam="0" sParam="" />

    <!-- Cut cleaned text to Windows clipboard -->
    <Action type="0" message="2177" wParam="0" lParam="0" sParam="" />
    <!-- End Scintilla HTML-paste workaround bottom portion -->

    <!-- Notepad++ will save shortcuts.xml automatically as it
    rewrites the file whenever shortcuts/macros/plugins change.
    Some sections are required so empty sections will be recreated.
    -->

    </Macro>
    </Macros>
    <UserDefinedCommands>
    </UserDefinedCommands>

    <PluginCommands />
    <ScintillaKeys />

    </NotepadPlus>
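
    Every conversion in the macro above is the same six-Action pattern, with
    only the search character (message 1601) and the replacement (message
    1602) changing. As a rough sketch that is not part of the original macro,
    a short Python script could generate those blocks from a table instead of
    maintaining them by hand. The table contents and the function names are
    hypothetical; the message numbers and lParam values are simply copied
    from the macro above.

```python
from xml.sax.saxutils import escape

# Hypothetical table: (code point, Unicode name, ASCII replacement).
REPLACEMENTS = [
    (0x2122, "TRADE MARK SIGN", "(TM)"),
    (0x00AD, "SOFT HYPHEN", ""),
    (0x201E, "DOUBLE LOW-9 QUOTATION MARK", '"'),
]


def action(message, lparam=0, sparam=""):
    """Format one <Action> line in the same layout as the macro above."""
    return (f'<Action type="3" message="{message}" wParam="0" '
            f'lParam="{lparam}" sParam="{sparam}" />')


def replace_block(codepoint, label, replacement):
    """Return the comment line plus the six Action lines for one replacement."""
    find = f"&#x{codepoint:04X};"
    # Escape quotes so the value is safe inside a double-quoted XML attribute.
    repl = escape(replacement, {'"': "&quot;", "'": "&apos;"})
    return [
        f'<!-- Replace U+{codepoint:04X} ({label}) with "{replacement}" -->',
        action(1700),
        action(1601, sparam=find),
        action(1625),
        action(1602, sparam=repl),
        action(1702, lparam=768),
        action(1701, lparam=1609),
    ]


for cp, label, repl in REPLACEMENTS:
    print("\n".join(replace_block(cp, label, repl)))
    print()
```

    The output is meant to be pasted inside the <Macro> element while
    Notepad++ is closed.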
    --
    The problem with computers is that humans wrote all the programs.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Hank Rogers@Hank@nospam.invalid to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Wed Feb 11 17:48:30 2026
    From Newsgroup: alt.comp.os.windows-10

    Maria Sophia wrote on 2/11/2026 5:37 PM:
    Maria Sophia wrote:
    I have to force Notepad++ to rebuild the macro action list.
    1. Open Notepad++, making sure shortcuts.xml is not open in any editor.
    2. Go to Macro -> Modify Shortcut / Delete Macro
    3. Find the macro in the list & select it.
    4. Click "Delete" & then close Notepad++ completely.
    5. Open shortcuts.xml in gVim (or any editor that is NOT Notepad++).
    6. Paste the original macro back into shortcuts.xml.
       Make sure the U+2060 removal block is the very first Action
       in the entire macro, before any other Action.
    7. Save the file making sure nothing else related is running.
    8. Start Notepad++ again and re-run the grueling test.
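
    The "U+2060 block must come first" check in the steps above can also be
    done from outside Notepad++. The following is only a sketch: the macro
    name "Clean ASCII" and the inline sample are hypothetical stand-ins, and
    it assumes the search string of each replacement is carried by the
    Action with message 1601, as in the macro posted in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for a real shortcuts.xml.
SAMPLE = """<NotepadPlus><Macros>
<Macro name="Clean ASCII" Ctrl="no" Alt="no" Shift="no" Key="0">
  <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
  <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2060;" />
  <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
  <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
  <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
  <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />
</Macro></Macros></NotepadPlus>"""


def search_terms(xml_text, macro_name):
    """List the search strings of a macro in the order they appear in the file."""
    root = ET.fromstring(xml_text)
    for macro in root.iter("Macro"):
        if macro.get("name") == macro_name:
            return [a.get("sParam") for a in macro.iter("Action")
                    if a.get("message") == "1601"]
    raise KeyError(macro_name)


terms = search_terms(SAMPLE, "Clean ASCII")
# Show single-character search terms as code points, in file order.
print([f"U+{ord(t):04X}" for t in terms if len(t) == 1])
```

    For a real file, the text would be read from shortcuts.xml on disk
    instead of SAMPLE. This only shows the order in the file, not the order
    Notepad++ holds in memory.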

    OMG. Give me a gun. I want to shoot Scintilla's developers! :)
    (just kidding)

    But I am frustrated... as they wasted my valuable time.
    I finally figured out WHY Notepad++ wasn't running the macros in the
    order I wrote them.
    I had to FORCE Notepad++ to rebuild the entire macro from scratch!
    Then it worked fine in the same testcase it's been failing on.

    Jesus Christ. That's NOT intuitive. It's only intuitive AFTER you
    realize Notepad++ is rewriting the shortcuts.xml (keeping only your
    comments of what you add).

    Who knew?
    Not me. Now I do!

    Notepad++ does NOT execute macros directly from shortcuts.xml.
    Once I forced it to read macros from shortcuts.xml...
    Apostrophes are working.
    Zero-width characters too.
    Combining marks also.
    Double quotes, dashes, symbols and diacritics are working.
    So are invisible operators.
    Soft hyphen & line separators
    Even the Multi-word stress test is working.
    As is the mixed-chaos jumbled test.

    And yet, I changed nothing in the shortcuts.xml file!
    Jesus Christ.
    There was nothing wrong with the programming on my side.
    It was all because Notepad++ doesn't do what we think it does.

    Sheesh. Drives me nuts when I can't solve a problem.
    As I almost never fail - so it was driving me nuts.

    Mainly because I didn't understand what was happening.
    My "logic" was fine (as I'm extremely logical).

    It was simply that I was blindsided by Notepad++ changing the order.
    Without telling me it changed the order.

    By deleting the macro, the macro was finally rebuilt internally.
    Notepad++ finally loaded the new action order.
    U+2060 finally executed first!

    When Notepad++ starts, it reads shortcuts.xml ONCE.
    It loads all macros into an INTERNAL ARRAY in memory.
    After that, the XML file is ignored.
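
    A toy model of the behavior described above may make it easier to see
    why an edit made while the editor is running has no effect. This is an
    illustration only, not Notepad++'s actual code: the class, the dict that
    stands in for the disk, and the write-back-on-exit step are all
    assumptions made for the sketch.

```python
class ToyEditor:
    """Toy model: read the config once at startup, write it back on exit."""

    def __init__(self, disk):
        self.disk = disk                           # dict standing in for the file system
        self.macros = list(disk["shortcuts.xml"])  # loaded ONCE into memory

    def run_macro(self):
        return self.macros                         # always the in-memory copy

    def close(self):
        # Writing back on exit overwrites whatever was edited externally.
        self.disk["shortcuts.xml"] = list(self.macros)


disk = {"shortcuts.xml": ["apostrophes", "U+2060"]}

editor = ToyEditor(disk)
disk["shortcuts.xml"] = ["U+2060", "apostrophes"]   # external edit while running
stale = editor.run_macro()                          # still the old order
editor.close()                                      # external edit is lost

disk["shortcuts.xml"] = ["U+2060", "apostrophes"]   # edit while the editor is closed
fresh = ToyEditor(disk).run_macro()                 # new order is picked up

print(stale, fresh)
```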

    Only when we edit shortcuts.xml while Notepad++ is closed,
    does the next startup load the new version... but... but...
    But...
    If the XML structure is malformed,
    or the macro is outside the <Macros> block,
    or the <Macros> block is empty,
    or Notepad++ rewrites the file,
    or the macro name changes,
    or the macro is deleted and not recreated,

    then Notepad++ will:

    silently discard the macro,
    rebuild the XML in its own structure,
    and load the LAST VALID INTERNAL VERSION it had.

    This looks like Notepad++ is using an old cache.
    But it's not a cache.
    It's the internal macro array.

    Deleting the macro inside Notepad++ clears the internal array.

    I'm out of energy for today, but I will try to build a version 4.0
    (v4p0) that Notepad++ cannot reorder internally.

    That's a damn shame, Maria.


  • From Hank Rogers@Hank@nospam.invalid to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Wed Feb 11 18:26:04 2026
    From Newsgroup: alt.comp.os.windows-10

    Hank Rogers wrote on 2/11/2026 5:48 PM:
    Maria Sophia wrote on 2/11/2026 5:37 PM:
    <Snip>

    That's a damn shame, Maria.




    Say Maria, you really work hard on this. Is there a way we can pay to
    keep your shit going? Do you have a paypal account we can send money
    to? I realize you like to operate as a secret agent using super
    privacy, so it's probably never possible. Just asking anyway.

    You are very secretive, like mission impossible ... Makes it very
    exciting for us normal people who are not double naught spies like you!

    Be careful out there!

  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Wed Feb 11 20:05:17 2026
    From Newsgroup: alt.comp.os.windows-10

    For those who wish to learn more about this HTML fragment problem,
    Microsoft outlines a few instructive scenarios in this learning document:
    *HTML Clipboard Format*
    <https://learn.microsoft.com/en-us/windows/win32/dataxchg/html-clipboard-format>
    "The CF_HTML clipboard format allows a fragment of raw HTML text and its
    context (i.e. outer HTML) to be stored on the clipboard as ASCII."

    It's not just Chromium copypasta that has this problem, as even Microsoft
    Excel has been affected by a confirmed bug in Microsoft's WPF framework
    where HTML fragments copied to the clipboard have incorrect offsets,
    causing Excel and other apps to reject or mishandle the paste.
    <https://github.com/dotnet/wpf/issues/10476>

    But most of the HTML fragment issues I've unearthed by studying this
    phenomenon seem to be related to Chromium browsers (Chrome, Edge, Brave,
    etc.), which wrap HTML clipboard content in extra <html> and <body> tags
    when writing to the clipboard.
    <https://issues.chromium.org/issues/328477621>
    --
    Sometimes, the simplest answer is simple only after you know it.
  • From Daniel70@daniel47@nomail.afraid.org to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Thu Feb 12 20:19:12 2026
    From Newsgroup: alt.comp.os.windows-10

    On 12/02/2026 11:26 am, Hank Rogers wrote:

    <Snip>

    Say Maria, you really work hard on this. Is there a way we can pay to
    keep your shit going? Do you have a paypal account we can send money
    to? I realize you like to operate as a secret agent using super
    privacy, so it's probably never possible. Just asking anyway.

    You are very secretive, like mission impossible ... Makes it very
    exciting for us normal people who are not double naught spies like you!

    Be careful out there!

    I wish He/She/It WOULD keep His/Her identity SECRET by NOT changing it
    every six to twelve months to beat other people (ME) trying to filter
    those posts into the big bit bucket!!
    --
    Daniel70
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Thu Feb 12 14:49:33 2026
    From Newsgroup: alt.comp.os.windows-10

    Doing my part to ignore insults from those who can never add value,
    but who feel desperate to post something (anything!) for some odd reason,
    the summary below explains (to the best of my knowledge) what happened.

    The problem has been solved (see the recent detailed Notepad++ macro)
    but this article below attempts to explain why Firefox doesn't cause this.

    Only Chromium.

    Hmmm....

    I wondered why this problem of pasting into Notepad++ (which I've fixed
    using a macro that adds a space & then removes it) doesn't happen with my
    Firefox pastes. It only seems to happen with my Chromium pastes.

    As Carlos, Andy & Paul astutely and helpfully noted, applications on
    Windows can apparently place HTML fragments on the clipboard simply
    because the Windows clipboard architecture is designed to support
    multiple parallel data formats for a single copy operation. Hence, modern
    applications such as Chromium use this capability to provide rich content
    to any target program that can consume it (Notepad++ not being one of
    them, I guess).
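
    To see those parallel formats for yourself, the formats currently on the
    clipboard can be listed with a few Win32 calls. This is a minimal sketch
    that assumes Windows and a stock Python install; the helper names are
    hypothetical, while OpenClipboard, EnumClipboardFormats,
    GetClipboardFormatNameW and CloseClipboard are the documented user32
    functions. After a copy from a Chromium browser, a registered format
    named "HTML Format" would be expected in the list.

```python
import ctypes
import sys

# A few predefined clipboard format IDs from the Win32 headers.
STANDARD_FORMATS = {1: "CF_TEXT", 7: "CF_OEMTEXT", 13: "CF_UNICODETEXT", 16: "CF_LOCALE"}


def format_name(fmt_id, registered_name=""):
    """Pick a readable name for a clipboard format ID."""
    if fmt_id in STANDARD_FORMATS:
        return STANDARD_FORMATS[fmt_id]
    return registered_name or f"format #{fmt_id}"


def list_clipboard_formats():
    """Return the names of all formats currently on the clipboard (Windows only)."""
    if sys.platform != "win32":
        raise OSError("this sketch only works on Windows")
    user32 = ctypes.windll.user32
    if not user32.OpenClipboard(None):
        raise OSError("could not open the clipboard")
    try:
        names = []
        fmt = user32.EnumClipboardFormats(0)
        while fmt:
            buf = ctypes.create_unicode_buffer(256)
            # Returns 0 (empty name) for predefined formats such as CF_UNICODETEXT.
            user32.GetClipboardFormatNameW(fmt, buf, len(buf))
            names.append(format_name(fmt, buf.value))
            fmt = user32.EnumClipboardFormats(fmt)
        return names
    finally:
        user32.CloseClipboard()


if sys.platform == "win32":
    print(list_clipboard_formats())
```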

    It was known to Paul, Andy & Carlos, but not to me, that when Chromium
    copies a selection, it generates both a plain text stream and an HTML
    Fragment block that follows the Microsoft HTML Clipboard Format
    specification.

    This specification requires StartHTML, EndHTML, StartFragment, and
    EndFragment offsets so that applications can extract only the visible
    portion of the Document Object Model (DOM).
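
    As a rough illustration of those four offsets, here is a sketch that
    builds an HTML Format block for a given fragment. It follows my reading
    of the Microsoft description (byte offsets counted from the start of the
    data, UTF-8 text, fragment wrapped in StartFragment/EndFragment
    comments). The function name and the fixed 10-digit padding are choices
    made for the sketch, and real browsers add more context than this.

```python
HEADER = (
    "Version:0.9\r\n"
    "StartHTML:{:010d}\r\n"
    "EndHTML:{:010d}\r\n"
    "StartFragment:{:010d}\r\n"
    "EndFragment:{:010d}\r\n"
)
START_MARK = "<!--StartFragment-->"
END_MARK = "<!--EndFragment-->"


def build_cf_html(fragment: str) -> bytes:
    """Wrap an HTML fragment in a CF_HTML style header with byte offsets."""
    prefix = ("<html><body>" + START_MARK).encode("utf-8")
    suffix = (END_MARK + "</body></html>").encode("utf-8")
    body = fragment.encode("utf-8")

    # The header has a fixed width because every offset is zero-padded,
    # so its length can be measured before the real offsets are known.
    header_len = len(HEADER.format(0, 0, 0, 0).encode("utf-8"))

    start_html = header_len
    start_fragment = start_html + len(prefix)
    end_fragment = start_fragment + len(body)
    end_html = end_fragment + len(suffix)

    header = HEADER.format(start_html, end_html, start_fragment, end_fragment)
    return header.encode("utf-8") + prefix + body + suffix


print(build_cf_html("<b>hello</b>").decode("utf-8"))
```

    Because the offsets count bytes, not characters, a fragment with
    non-ASCII text shifts EndFragment and EndHTML by more than its visible
    length.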

    What's a DOM? I don't/didn't know, but I looked it up and it seems to be
    the word they use for the internal tree structure that a browser builds
    after it parses an HTML page. So why Chromium and not Firefox then?

    It turns out that Chromium and Firefox handle clipboard HTML in different
    ways, apparently because they were built on different internal models for
    selection, rendering and data transfer. The relevant point here is that
    Chromium always generates an HTML Fragment block when copying from a web
    page because its editing and selection subsystem is based on the WebKit
    and Blink design, which treats every selection as a range of DOM nodes
    that can be serialized into both plain text and HTML.

    This behavior was inherited from the original WebKit clipboard code and
    was kept for compatibility with Windows applications that expect rich
    HTML on the clipboard.

    On the other hand, Firefox uses a different clipboard pipeline that
    was originally built around XUL and the Gecko editor, and it apparently
    only emits HTML Format when the selection contains markup that Firefox
    considers meaningful. That would explain why this problem only happens
    when I copy from Chromium-based browsers: in most cases Firefox emits
    only plain text because its selection serializer is more conservative
    and does not always generate a full HTML Fragment block.

    Chromium, by contrast, always emits HTML Format because its design
    goal is to maximize fidelity when pasting into applications like Word or
    Outlook, while Firefox focuses on correctness and minimal output.

    Who knew?
    Not me.
    Now I do.

    As a result, Chromium produces these problematic HTML fragments far more
    often than Firefox does, even when the user sees no visible formatting.
    --
    On Usenet, people help others out of their kindness and generosity.
  • From Carlos E. R.@robin_listas@es.invalid to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Thu Feb 12 21:03:10 2026
    From Newsgroup: alt.comp.os.windows-10

    On 2026-02-12 20:49, Maria Sophia wrote:
    Doing my part to ignore insults from those who can never add value,
    but who feel desperate to post something (anything!) for some odd reason,
    the summary below explains (to the best of my knowledge) what happened.

    Arlen, he is not insulting you.

    You make a choice to change your name frequently. It is your choice, but
    it is a fact that this is bad manners towards us. You do not like us
    telling you this, sure, but we are not insulting you. We are just stating
    a fact that you don't like.

    ...
    --
    Cheers,
    Carlos E.R.
  • From Hank Rogers@Hank@nospam.invalid to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Thu Feb 12 14:25:15 2026
    From Newsgroup: alt.comp.os.windows-10

    Carlos E. R. wrote on 2/12/2026 2:03 PM:
    On 2026-02-12 20:49, Maria Sophia wrote:
    Doing my part to ignore insults from those who can never add value,
    but who feel desperate to post something (anything!) for some odd reason,
    the summary below explains (to the best of my knowledge) what happened.

    Arlen, he is not insulting you.

    You make a choice to change your name frequently. It is your choice, but
    it is a fact that this is bad manners towards us. You do not like us
    telling you this, sure, but we are not insulting you. We are just stating
    a fact that you don't like.

    ...


    I believe he is psychotic.


  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Thu Feb 12 15:37:24 2026
    From Newsgroup: alt.comp.os.windows-10

    Hank Rogers wrote:
    I believe he is psychotic.

    Privacy is always most deprecated by those who least understand it, but I
    will continue to ignore the insults which took them 3 seconds to compose.
    However, I would like to ask those who continue to post off-topic trolls
    to please summarize what the problem set is about in this thread topic so
    that we can all be edified as to what they think of the proposed solution
    set.

    Since I strive to add value in every post, even when responding to the
    trolls, I wrote this to edify the Firefox group on all the major
    platforms, while Hank and Carlos were spending about 3 seconds of their
    valuable time composing off-topic trolls that have nothing to do with
    the topic.

    I simply ask the trolls to respond to this post in an on-topic way
    that adds value (which means they need to invest their intelligence).

    Newsgroups: alt.comp.software.firefox,comp.sys.mac.system,alt.os.linux
    Subject: PSA: Clipboard differences between Chromium & Firefox across
    platforms
    Date: Thu, 12 Feb 2026 15:26:32 -0500
    Organization: BWH Usenet Archive (https://usenet.blueworldhosting.com)
    Message-ID: <10mld1o$1910$1@nnrp.usenet.blueworldhosting.com>

    PSA: Clipboard differences between Chromium & Firefox across platforms

    I do a lot of research, as I generally invest at least an hour or two
    into many of my Usenet opening posts. For that I currently employ a
    thousand-line Windows Notepad++ macro that beautifully cleans up
    non-ASCII garbage copied from both Firefox and Chromium web output.
    Only with Chromium pastes into Notepad++ was the selection mechanism
    (i.e., Ctrl+A) inoperative.

    Newsgroups: alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows
    Subject: PSA: HTML fragment mode interaction between Chromium, Clipboard & Notepad++
    Date: Tue, 10 Feb 2026 07:17:37 -0500
    Message-ID: <10mf7l0$dbf$1@nnrp.usenet.blueworldhosting.com>

    After much effort, I was able to resolve the problem in one programmed
    stroke (as all my solutions are one-tap solutions) but what I wanted to
    remind those on the Firefox newsgroup is WHY only Chromium. Not Firefox.

    It goes under a bunch of names when I researched it, but they're the same.
    1. The HTML Fragment Clipboard Format issue
    2. The StartHTML offset bug
    3. The Windows HTML Clipboard Format quirk
    4. The Chromium HTML clipboard serialization problem
    5. The multi-format clipboard mismatch issue
    etc.

    The HTML Fragment issue showed up for me on Windows when pasting into
    Notepad++ from Chromium, but the underlying cause applies to all
    platforms because it apparently comes from how Chromium generates
    clipboard data, which is DIFFERENT from how Firefox does the same
    copy/paste tasks.

    Even as Firefox uses a different and more conservative clipboard path,
    the problem seemed almost random because the editor is intimately
    involved.

    The important point is that not all editors behave the same way when
    they receive multiple clipboard formats. Some editors request only plain
    text from clipboard pastes, while some request HTML Format, and some
    request both and then choose one based on the editor's own internal
    rules.

    On Windows, Notepad++ requests the plain text stream, but the presence of
    the HTML Fragment block can influence how the plain text stream is
    generated or parsed.

    Other editors on other platforms may ignore the HTML Fragment block or
    may sanitize the plain text stream differently. This means the issue is
    not tied to one editor. It depends on how each editor interacts with the
    clipboard and how it handles the plain text stream when HTML Format is
    also present.

    Unfortunately for me, Chromium always emits HTML Format, so any editor on
    any platform that does not expect it or that parses the plain text stream
    in a strict way can show the odd behavior which tripped up my one-tap Unicode-to-ASCII conversion solutions. Copying from Firefox avoids this
    because FF apparently emits HTML Format only when needed, so the plain text stream is simpler and more predictable across editors and platforms.

    That's why I didn't see this problem when I used Firefox for my research
    copy/pasting. Only a paste from Chromium browsers broke the Control+A key
    press (which also broke the left-mouse-sweep selection workaround).

    Specifically, since I edit in GVim but I convert Unicode-to-ASCII in
    Notepad++, this HTML Fragment issue showed up for me on Windows when
    pasting into Notepad++, where NOTHING VISIBLE marked the land mine.

    The land mine being invisible is important to note because I originally
    went down character set and invisible character rat holes, but Notepad++'s
    hex editor showed absolutely nothing there causing the problem.

    It's an invisible land mine.

    I'm writing this PSA to help others, since the underlying cause applies
    to all platforms (and some editors): it comes from the design of the
    Chromium clipboard pipeline, not only from Notepad++ or Windows alone.

    To delve a bit deeper into the cause of this invisible land mine,
    apparently Chromium always places multiple formats on the clipboard when we copy from a web page. This includes plain text, HTML Format, and several internal formats. The HTML Format block follows a Microsoft specification
    that uses StartHTML, EndHTML, StartFragment, and EndFragment offsets to
    mark the visible part of the selection.
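
    To make the offsets concrete, here is a minimal sketch that builds and
    then parses an "HTML Format" payload. It assumes the commonly seen
    Version:0.9 header layout with ten-digit, zero-padded byte offsets
    counted from the start of the payload. The helper names
    (build_cf_html, parse_cf_html, extract_fragment) are made up for this
    example, and real payloads from a browser carry extra fields (such as
    SourceURL) that are not handled here.

```python
import re

# Header template: every offset is a 10-digit, zero-padded byte count.
HEADER = (
    "Version:0.9\r\n"
    "StartHTML:{:010d}\r\n"
    "EndHTML:{:010d}\r\n"
    "StartFragment:{:010d}\r\n"
    "EndFragment:{:010d}\r\n"
)
PREFIX = b"<html><body>\r\n<!--StartFragment-->"
SUFFIX = b"<!--EndFragment-->\r\n</body></html>"


def build_cf_html(fragment: str) -> bytes:
    """Wrap an HTML fragment in a header with matching byte offsets."""
    body = fragment.encode("utf-8")
    start_html = len(HEADER.format(0, 0, 0, 0))  # header is pure ASCII
    start_fragment = start_html + len(PREFIX)
    end_fragment = start_fragment + len(body)
    end_html = end_fragment + len(SUFFIX)
    header = HEADER.format(start_html, end_html, start_fragment, end_fragment)
    return header.encode("ascii") + PREFIX + body + SUFFIX


def parse_cf_html(data: bytes) -> dict:
    """Read the four offsets out of the header."""
    offsets = {}
    for key in (b"StartHTML", b"EndHTML", b"StartFragment", b"EndFragment"):
        match = re.search(key + rb":(\d+)", data)
        if match is None:
            raise ValueError("missing header field: " + key.decode("ascii"))
        offsets[key.decode("ascii")] = int(match.group(1))
    return offsets


def extract_fragment(data: bytes) -> str:
    """Return only the selected part, as marked by the fragment offsets."""
    offsets = parse_cf_html(data)
    return data[offsets["StartFragment"]:offsets["EndFragment"]].decode("utf-8")


if __name__ == "__main__":
    payload = build_cf_html("caf\u00e9 <b>bold</b>")
    print(parse_cf_html(payload))
    print(extract_fragment(payload))
```

    Note that the offsets count bytes, not characters, which is why the
    sketch encodes to UTF-8 before measuring anything.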

    While I have not tested the other platforms, apparently Chromium does this
    on all platforms because its selection code comes from the WebKit and Blink model, which treats every selection as a range of DOM nodes that can be serialized into both plain text and HTML.

    Firefox takes a different approach.

    Mozilla's clipboard pipeline was apparently built around the Gecko editor
    and it only emits HTML Format when the selection contains markup that
    Firefox considers meaningful. In many cases Firefox emits only plain text. Perhaps this is why pasting from Firefox did not trigger the same behavior
    that I saw when pasting from Chromium-based browsers.

    The practical effect for users on all common consumer operating systems is
    that Chromium produces HTML fragments far more often than Firefox does,
    even when the user sees no visible formatting.

    On Windows this can expose quirks in applications that do not expect
    HTML Format or that handle the plain text stream differently when HTML
    is also present. On Linux and macOS the details differ, but the general
    pattern is the same because the behavior comes from Chromium, not from
    the operating system.

    This PSA is only a heads up. Firefox is not at fault here. It simply
    uses a more conservative clipboard serializer. Chromium uses a richer
    one. If you ever see odd paste behavior in a text editor, the difference
    in clipboard formats may be the reason, which is why I wrote this up.

    Had I known about this months ago, I would have understood why I had to
    add the extra step of inserting a blank line after every Chromium paste,
    when I never had to add that blank line after a Firefox paste.

    Adding a blank line, to me, is anathema, because it's an extra step.
    I hate extra steps. I grew up in DEC/VAX/PDP11 days and I learned early on
    when burning the 2Kbit EEPROMs for Motorola MC68701 micro controllers by
    hex (not assembly, but hex) that everything can be halved in steps.

    Unlike quantum physics though, I stop when I get it to be one step.

    In short, the invisible land mine I hit came from the way Chromium
    generates clipboard data differently than Firefox does.

    Chromium places multiple formats on the clipboard on all platforms, not
    only on Windows.

    This behavior is part of the Chromium clipboard design itself.

    The exact formats differ by operating system because each OS has
    its own clipboard API, but the pattern is the same.

    Chromium always serializes the selection into plain text, HTML Format, and several internal formats. On Windows this shows up as CF_UNICODETEXT and
    HTML Format with StartHTML and StartFragment offsets. On macOS it shows up
    as NSPasteboard types for plain text and HTML. On Linux it shows up as
    multiple MIME types on the X11 or Wayland clipboard.
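
    The per-platform names can be summarized as a small lookup table. This
    is a sketch for orientation only: the identifiers listed are the usual
    ones for each platform, the table is not exhaustive (each platform
    offers more formats than these two), and the carries_html() helper is
    a hypothetical name used just for this example.

```python
# Usual names for the plain-text and HTML clipboard formats per platform.
CLIPBOARD_FORMATS = {
    "windows": {"plain": "CF_UNICODETEXT", "html": "HTML Format"},
    "macos": {"plain": "public.utf8-plain-text", "html": "public.html"},
    "linux": {"plain": "text/plain", "html": "text/html"},
}


def carries_html(platform: str, offered_formats) -> bool:
    """True if the offered formats include that platform's HTML format."""
    return CLIPBOARD_FORMATS[platform]["html"] in set(offered_formats)


if __name__ == "__main__":
    print(carries_html("windows", ["CF_UNICODETEXT", "HTML Format"]))
    print(carries_html("linux", ["text/plain"]))
```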

    The important point is that Chromium always provides HTML along with plain text, while Firefox provides HTML only when needed. This is why the issue
    can appear on any platform that uses Chromium and any editor that reacts to
    the presence of HTML on the clipboard.

    The invisible land mine issue I had to solve yesterday comes from the
    design of the Chromium clipboard pipeline, and any editor that handles the plain text stream in a strict way can be tripped up by the extra HTML
    Fragment data that Chromium always provides.

    I hope this PSA saves you from looking in all the wrong places, when
    the solution was as simple as automatically adding & deleting a space at
    the very beginning of my thousand-line Notepad++ macro to normalize
    pasted text.
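
    For anyone who prefers a script to a macro, the Unicode-to-ASCII part
    of that normalization can be sketched as a translation table. This is
    not the macro itself: it covers only a handful of the characters the
    macro handles, and the to_ascii() name is invented for this example.

```python
# Map a few troublesome code points to ASCII (or to nothing at all).
REPLACEMENTS = {
    "\u2060": "",     # word joiner
    "\u200b": "",     # zero width space
    "\u200c": "",     # zero width non-joiner
    "\u200d": "",     # zero width joiner
    "\ufeff": "",     # byte order mark
    "\u00a0": " ",    # no-break space
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2026": "...",  # horizontal ellipsis
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2212": "-",    # minus sign
}

# str.translate wants code points (integers) as keys.
TABLE = {ord(char): replacement for char, replacement in REPLACEMENTS.items()}


def to_ascii(text: str) -> str:
    """Apply every replacement in a single pass over the text."""
    return text.translate(TABLE)


if __name__ == "__main__":
    sample = "it\u2019s\u200b \u201cfine\u201d\u2026 a\u00a0b \u2013 c"
    print(to_ascii(sample))
```

    Because the table is applied in one pass, the order of the entries
    does not matter, unlike a sequence of separate Replace All steps.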

    If this edifies you, then the time invested in writing it was worthwhile.
    --
    Had I known how it works, I would have written up a tutorial instead since
    I'm a rare breed of person who delights in edifying everyone around me.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Daniel70@daniel47@nomail.afraid.org to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Fri Feb 13 19:23:03 2026
    From Newsgroup: alt.comp.os.windows-10

    On 13/02/2026 6:49 am, Maria Sophia wrote:
    Doing my part to ignore insults from those who can never add value,
    but who feel desperate to post something (anything!) for some odd reason,
    the summary below explains (to the best of my knowledge) what happened.

    SORRY!!

    Quoting me ....
    I wish He/She/It WOULD keep His/Her identity SECRET by NOT changing it
    every six - twelve months to beat other people (ME) trying to filter
    those posts to the big bit bucket!!
    End Quote

    Where did I insult you, .... all I did say was that I wished YOU
    wouldn't keep changing your NYM ... thereby routing other posters'
    attempts to block you.

    Am I wrong?? Why DO you change YOUR Nym sooooo often??
    --
    Daniel70
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Fri Feb 13 04:05:08 2026
    From Newsgroup: alt.comp.os.windows-10

    Daniel70 wrote:
    Where did I insult you

    Before we answer that question for you, please let us know what you learned from this thread, and let us know what you would do to improve the details.

    Specifically if you can improve this Notepad++ shortcuts.xml file, that
    would be a good use of your intelligence if you're so willing to share.

    Particularly difficult is the method to get Notepad++ to close the file
    after the control+B from within the macro. Also, what hasn't been
    solved yet, and what you can help with since you're so eager to post to
    this thread, is how to automatically save the original file from within
    the macro.

    Both of those are important steps which need to be added to this macro,
    where since you're so eager to add on-topic value, that's where your vast intelligence will be best employed to help others use this file too!

    If you have any questions about this proposed solution, please just ask.

    <?xml version="1.0" encoding="UTF-8" ?>
    <!-- C:\app\editor\txt\N++\shortcuts.xml for Windows Notepad++ (N++) -->
    <!-- Automatically cleans fragments, converts to ASCII & copies to clipbrd -->
    <!-- Use model: Control+V (paste) & Control+B (run the macro) -->
    <!-- v3p9 20260211 N++ was not running the macro in the order shown -->
    <!-- But it turned out any error causes an OLDER version to run. -->
    <!-- Worse, when that happens, N++ overwrites this file -->
    <!-- Worse, N++ is executing macro actions in a different order -->
    <!-- than they appear in the XML so a total rewrite is needed in v4p0 -->
    <!-- v3p8 20260211 U+2060 is driving me nuts so it's the first block now -->
    <!-- v3p7 20260211 moved U+2060 up because it's the most disruptive -->
    <!-- v3p6 20260211 U+2009 & U+200B not being converted properly -->
    <!-- v3p5 20260211 fixed U+200B failing when U+200B is between ' & s -->
    <!-- A 2nd pass was duplicated after apostrophe normalization rules -->
    <!-- v3p4 20260211 added U+275E (heavy double quote right) -->
    <!-- v3p3 20260211 added U+2009 (thin space) -->
    <!-- v3p2 20260211 added seven new conversions after running testcases -->
    <!-- U+02BE (modifier letter right half ring) -->
    <!-- U+02BF (modifier letter left half ring) -->
    <!-- U+201E (double low-9 quote) -->
    <!-- U+201F (double high-reversed-9 quote) -->
    <!-- U+275D (heavy double quote left) -->
    <!-- U+275E (heavy double quote right) -->
    <!-- U+2015 (horizontal bar) -->
    <!-- U+2009 (thin space) -->
    <!-- v3p1 20260211 reorganized into a dozen distinct categories -->
    <!-- (1) control characters: U+000F U+0001 -->
    <!-- (2) dashes & minus signs: U+2010 U+2011 U+2012 U+2212 -->
    <!-- (3) zero-width characters: U+200C U+200B U+200D U+FEFF U+2060 -->
    <!-- (4) special spaces: U+00A0 U+2007 U+202F U+200A U+2008 U+2006 -->
    <!-- (5) apostrophe-like characters:
    U+0F0C U+2018 U+2019 U+2032 U+02BC U+02B9 U+02C8 U+02EE
    U+201B U+02CB U+A78C U+FF07 -->
    <!-- (6) combining marks (remove after apostrophes):
    U+0351 U+0307 U+0331 U+0335 U+0336 U+0337 U+0338 -->
    <!-- (7) double-quote normalization: U+201C U+201D -->
    <!-- (8) dash-like & ellipsis & HTML entities:
    U+2026 &#151; U+2014 U+2013 &zwnj; -->
    <!-- (9) bullets, math symbols, diacritics:
    U+2022 U+8722 U+011F U+2009 U+00E1 U+0161 U+011B -->
    <!-- (10) miscellaneous symbols:
    U+2713 ASCII hyphen ` U+2192 U+00B0 U+00A9 U+2122 U+00AE -->
    <!-- (11) invisible operators:
    U+00AD U+2061 U+2062 U+2063 U+2064 U+180E -->
    <!-- (12) line separators: U+2028 U+2029 U+0085 -->
    <!-- v3p0 20260211 added combining marks U+0351 U+0307 U+0331 -->
    <!-- v3p1 20260211 added apostrophe-like characters U+201B U+02CB -->
    <!-- v2p9 20260211 moved U+2060 to be above apostrophe-related blocks -->
    <!-- v2p8 20260211 fixed Chromium CF_HTML paste control+A anomaly -->
    <!-- v2p7 20260211 added U+02EE modifier letter double apostrophe rule -->
    <!-- v2p6 20260211 fixed U+02C8 (modifier letter vertical line) rule -->
    <!-- v2p5 20260211 fixed U+02B9 (modifier letter prime) rule -->
    <!-- v2p4 20260211 removed one of two U+000F blocks -->
    <!-- v2p3 20260211 removed two (duplicate) 1700 lines in U+0161 -->
    <!-- v2p2 20260211 fixed all zero-width blocks to replace with nothing -->
    <!-- v2p1 20260211 fixed BOM to replace with nothing -->
    <!-- v2p0 20260210 cleaned (emptied out) closing sections of the file -->
    <!-- v1p9 20260210 ported old shortcuts.xml to improve coverage -->
    <!-- Cleans Chromium pasted text & normalizes Unicode to ASCII -->
    <!-- Use model: paste (using control+v) & fix (using control+b) -->
    <!-- The macro should 1st break CF_HTML fragment mode (so Ctrl+A works) -->
    <!-- and then run the Unicode-to-ASCII cleanup on all the pasted text -->
    <!-- cutting (control+x) the result back into the Windows clipboard -->
    <!-- thereby leaving the N++ GUI empty & ready for the next paste-->

    <!--
    To break Scintilla's CF_HTML fragment mode, we need to make any edit.
    We can insert a space & then delete that space, for example.
    -->


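    <!-- Notepad++ macro message meanings:
    1700 = IDC_FRCOMMAND_INIT (begin a new search/replace operation)
    1601 = IDFINDWHAT (set the search string, the Unicode character to find)
    1625 = IDNORMAL (set the search mode; lParam 0 = normal)
    1602 = IDREPLACEWITH (set the replacement string, the ASCII equivalent)
    1702 = IDC_FRCOMMAND_BOOLEANS (option flags; 768 = wrap around + search down)
    1701 = IDC_FRCOMMAND_EXEC (run the command; lParam 1609 = Replace All)
    2001 = SCI_ADDTEXT (intended here to insert a space; SCI_REPLACESEL is 2170)
    2326 = SCI_DELETEBACK (deletes the space)
    2013 = SCI_SELECTALL (selects everything)
    2177 = SCI_CUT (cut all)
    41001 = IDM_FILE_NEW (IDM_FILE_CLOSE is 41003; IDM_FILE_EXIT is 41011)
    -->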


    <!-- When you paste from a Chromium-based app, the clipboard contains:
    CF_UNICODETEXT (plain text) & CF_HTML (HTML fragment)
    And sometimes CF_RTF where N++ prefers CF_HTML if available.

    v2p0 fixes a N++ selection issue caused by CF_HTML pastes.
    "HTML Paste Mode" prevents the "Control+A" from working.
    "HTML paste mode" inserts HTML fragment as plain text
    where Ctrl+A is disabled until the buffer is "normalized"
    (until the first edit that breaks the fragment state) -->


    <!-- Below is garbage that N++ adds to shortcuts.xml -->
    <NotepadPlus>
    <InternalCommands>
    <Shortcut id="43009" Ctrl="no" Alt="no" Shift="no" Key="0" />
    </InternalCommands>
    <Macros>
    <!-- Above is garbage that N++ adds to shortcuts.xml -->

    <!-- ASCII "control+b" Cleanup Macro -->
    <Macro name="ASCII" Ctrl="yes" Alt="no" Shift="no" Key="66">

    <!-- Begin Scintilla HTML-paste workaround top portion -->
    <!-- Break Chromium CF_HTML fragment mode by adding & deleting a space-->
    <Action type="0" message="2001" wParam="32" lParam="0" sParam="" />
    <Action type="0" message="2326" wParam="0" lParam="0" sParam="" />
    <!-- Select all text before running cleanup -->
    <Action type="0" message="2013" wParam="0" lParam="0" sParam="" />
    <!-- End Scintilla HTML-paste workaround top portion -->

    <!-- BEGIN CONVERSION BLOCKS -->

    <!-- U+2060 is driving me nuts so I'm making it the 1st block -->
    <!-- U+2060 must be placed above the apostrophe-related blocks -->
    <!-- Otherwise apostrophe block may skip over it -->
    <!-- U+2060 is disruptive as it must be placed above zero-width too -->
    <!-- Replace U+2060 (WORD JOINER) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2060;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #1. CONTROL CHARACTERS (remove first) -->
    <!-- Replace U+000F (SHIFT-OUT control character) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x000F;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+0001 (SOH control character) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0001;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #2. DASHES & MINUS SIGNS (safest to remove early) -->
    <!-- Replace U+2010 (HYPHEN) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2010;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2011 (NON-BREAKING HYPHEN) with ASCII hyphen "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2011;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2012 (FIGURE DASH) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2012;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2212 (MINUS SIGN) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2212;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #3. ZERO-WIDTH CHARACTERS (must be BEFORE apostrophes) -->

    <!-- Replace U+200C (ZERO WIDTH NON-JOINER) with "" (nothing) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200C;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+200B (ZERO WIDTH SPACE) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200B;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+200D (ZERO WIDTH JOINER) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200D;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+FEFF (BOM) with nothing -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#xFEFF;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #4. SPECIAL SPACES (convert to ASCII space) -->
    <!-- Replace U+00A0 (NO-BREAK SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00A0;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2007 (FIGURE SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2007;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+202F (NARROW NO-BREAK SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x202F;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+200A (HAIR SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200A;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2008 (PUNCTUATION SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2008;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2006 (SIX-PER-EM SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2006;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #5. APOSTROPHE-LIKE CHARACTERS -->
    <!-- Replace U+0F0C (TIBETAN MARK DELIMITER) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0F0C;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2018 (LEFT SINGLE QUOTE) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2018;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2019 (RIGHT SINGLE QUOTATION) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2019;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2032 (PRIME) with ASCII apostrophe -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2032;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02BC (MODIFIER LETTER APOSTROPHE) with ASCII apostrophe -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02BC;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02B9 (MODIFIER LETTER PRIME) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02B9;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02C8 (MODIFIER LETTER VERTICAL) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02C8;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02EE (MODIFIER DOUBLE APOSTROPHE) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02EE;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+201B (SINGLE HIGH-REVERSED-9 QUOTATION MARK) with apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201B;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02CB (MODIFIER LETTER GRAVE ACCENT) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02CB;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- This is a duplication which is after the apostrophes -->
    <!-- When U+200B appears between two characters that were already replaced -->
    <!-- the first pass fails to remove it, so I added this duplicate -->
    <!-- Remove U+200B (ZERO-WIDTH SPACE) second pass -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x200B;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #6. COMBINING MARKS (remove only after apostrophes are done) -->
    <!-- Remove U+0351 (COMBINING RIGHT HALF RING ABOVE) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0351;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0307 (COMBINING DOT ABOVE) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0307;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0331 (COMBINING MACRON BELOW) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0331;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />


    <!-- #7. DOUBLE-QUOTE NORMALIZATION -->
    <!-- Replace U+201C (LEFT DOUBLE QUOTE) with ASCII double quote " -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201C;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam='&quot;' />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+201D (RIGHT DOUBLE QUOTE) with ASCII double quote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201D;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam='&quot;' />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #8. ELLIPSIS, EM DASH, EN DASH, HTML ENTITIES -->
    <!-- Replace U+2026 (HORIZONTAL ELLIPSIS) with ASCII "..." -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2026;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="..." />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace literal &#151; (HTML entity for EM DASH) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&amp;#151;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2014 (EM DASH) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2014;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2013 (EN DASH) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2013;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace literal &zwnj; (ZERO WIDTH NON-JOINER entity) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&amp;zwnj;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #9. BULLETS, MATH SYMBOLS, LETTERS WITH DIACRITICS -->
    <!-- Replace U+2022 (BULLET) with ASCII "*" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2022;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&#x002A;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2212 (MINUS SIGN, decimal 8722) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2212;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+011F (LATIN SMALL G WITH BREVE) with ASCII "g" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x11f;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="g" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+00E1 (LATIN SMALL A WITH ACUTE) with ASCII "a" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#xe1;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="a" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+0161 (LATIN SMALL S WITH CARON) with ASCII "s" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x161;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="s" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+011B (LATIN SMALL E WITH CARON) with ASCII "e" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x11b;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="e" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #10. MISCELLANEOUS SYMBOLS -->
    <!-- Replace U+2713 (CHECK MARK) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2713;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace ASCII hyphen "-" with ASCII hyphen "-" (normalize) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace backtick with ASCII single quote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="`" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace Unicode Arrow (U+2192) with ASCII dash greaterthan -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2192;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="->" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace Degree Sign (U+00B0) with deg -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00B0;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="deg" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace Copyright (U+00A9) with (C) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00A9;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="(C)" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace Trademark (U+2122) with (TM) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2122;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="(TM)" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace Registered (U+00AE) with (R) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00AE;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="(R)" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />


    <!-- #11. INVISIBLE OPERATORS (remove) -->
    <!-- Replace U+00AD (SOFT HYPHEN) with "" (remove completely) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x00AD;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2061 (FUNCTION APPLICATION) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2061;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2062 (INVISIBLE TIMES) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2062;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2063 (INVISIBLE SEPARATOR) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2063;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2064 (INVISIBLE PLUS) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2064;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+180E (MONGOLIAN VOWEL SEPARATOR) with "" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x180E;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- #12. LINE SEPARATORS -->
    <!-- Replace U+2028 (LINE SEPARATOR) with an ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2028;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2029 (PARAGRAPH SEPARATOR) with an ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2029;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+0085 (NEXT LINE / NEL) with an ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0085;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+A78C (LATIN SMALL LETTER SALTILLO) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#xA78C;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+FF07 (FULLWIDTH APOSTROPHE) with ASCII apostrophe "'" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#xFF07;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0335 (COMBINING SHORT STROKE OVERLAY) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0335;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0336 (COMBINING LONG STROKE OVERLAY) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0336;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0337 (COMBINING SHORT SOLIDUS OVERLAY) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0337;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Remove U+0338 (COMBINING LONG SOLIDUS OVERLAY) -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x0338;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2043 (HYPHEN BULLET) with ASCII "-" -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2043;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02BE (MODIFIER LETTER RIGHT HALF RING) w/ ASCII apostrophe -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02BE;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+02BF (MODIFIER LETTER LEFT HALF RING) w/ ASCII apostrophe -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x02BF;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&apos;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+201E (DOUBLE LOW-9 QUOTATION MARK) w/ ASCII double quote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201E;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&quot;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+201F (DOUBLE HIGH-REVERSED-9 QUOTATION MARK) w/ dquote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x201F;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&quot;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+275D (HEAVY DOUBLE TURNED COMMA QUOTATION MARK ORNAMENT) w/ dquote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x275D;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&quot;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+275E (HEAVY DOUBLE COMMA QUOTATION MARK ORNAMENT) w/ dquote -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x275E;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&quot;" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2015 (HORIZONTAL BAR) with ASCII hyphen -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2015;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="-" />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- Replace U+2009 (THIN SPACE) with ASCII space -->
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="&#x2009;" />
    <Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam=" " />
    <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />

    <!-- END OF CONVERSION BLOCKS -->

    <!-- Begin Scintilla HTML-paste workaround bottom portion -->
    <!-- Select all cleaned text -->
    <Action type="0" message="2013" wParam="0" lParam="0" sParam="" />

    <!-- Cut cleaned text to Windows clipboard -->
    <Action type="0" message="2177" wParam="0" lParam="0" sParam="" />
    <!-- End Scintilla HTML-paste workaround bottom portion -->

    <!-- Close N++ -->
    <Action type="2" message="41001" wParam="0" lParam="0" sParam="" />

    <!-- N++ will save shortcuts.xml automatically, as it
    rewrites the file whenever shortcuts/macros/plugins change.
    Some sections are required, so empty sections will be recreated. -->


    </Macro>
    </Macros>
    <UserDefinedCommands>
    </UserDefinedCommands>

    <PluginCommands />
    <ScintillaKeys />

    </NotepadPlus>
    --
    One problem with Usenet is that there are people who can't ever contribute
    but who feel desperate to ruin the threads where people are contributing.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Hank Rogers@Hank@nospam.invalid to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Fri Feb 13 17:59:26 2026
    From Newsgroup: alt.comp.os.windows-10

    Carlos E. R. wrote on 2/13/2026 3:25 PM:
    On 2026-02-13 09:23, Daniel70 wrote:
    On 13/02/2026 6:49 am, Maria Sophia wrote:
    Doing my part to ignore insults from those who can never add value,
    but who feel desperate to post something (anything!) for some odd
    reason,
    the summary below explains (to the best of my knowledge) what happened.

    SORRY!!

    Quoting me ....
    I wish He/She/It WOULD keep His/Her identity SECRET by NOT changing it
    every six - twelve months to beat other peoples (ME) trying to filter
    those post to the big bit bucket!!
    End Quote

    Where did I insult you, .... all I did say was that I wished YOU
    wouldn't keep changing your NYM ... thereby routing other posters
    attempts to block you.

    Am I wrong?? Why DO you change YOUR Nym sooooo often??

    Look in comp.mobile.android for a post named "Happy New Year. It's
    January 1st. I'm changing the moniker on my accounts"

    I don't remember in which post he claimed that he shifts names in order
    to thwart robot searchers, to keep his privacy. Maybe it was some other
    post prior to this one.

    I'll abstain from expressing an opinion now.


    He is hopelessly psychotic, and will never recover. Sad.


  • From Hank Rogers@Hank@nospam.invalid to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Fri Feb 13 18:07:17 2026
    From Newsgroup: alt.comp.os.windows-10

    Carlos E. R. wrote on 2/13/2026 3:25 PM:
    On 2026-02-13 09:23, Daniel70 wrote:
    On 13/02/2026 6:49 am, Maria Sophia wrote:
    Doing my part to ignore insults from those who can never add value,
    but who feel desperate to post something (anything!) for some odd
    reason,
    the summary below explains (to the best of my knowledge) what happened.

    SORRY!!

    Quoting me ....
    I wish He/She/It WOULD keep His/Her identity SECRET by NOT changing it
    every six - twelve months to beat other peoples (ME) trying to filter
    those post to the big bit bucket!!
    End Quote

    Where did I insult you, .... all I did say was that I wished YOU
    wouldn't keep changing your NYM ... thereby routing other posters
    attempts to block you.

    Am I wrong?? Why DO you change YOUR Nym sooooo often??

    Look in comp.mobile.android for a post named "Happy New Year. It's
    January 1st. I'm changing the moniker on my accounts"

    I don't remember in which post he claimed that he shifts names in order
    to thwart robot searchers, to keep his privacy. Maybe it was some other
    post prior to this one.

    I'll abstain from expressing an opinion now.


    Drugs and electroshock therapy could tone him down for a while. It
    would help us, but not him. He needs help, but there is none. Current
    medical knowledge can't help poor Arlen.

    Since he is a savant, maybe he can teach us something before his demise.
    So far, though, it's only gibberish.

  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Fri Feb 13 19:24:43 2026
    From Newsgroup: alt.comp.os.windows-10

    Carlos E. R. wrote:
    I'll abstain from expressing an opinion now.

    Hi Carlos,

    *PLEASE STOP TROLLING THIS NEWSGROUP WITH YOUR CONSPIRACY THEORIES*

    It's obvious that Daniel70 is a troll so what you're doing is amplifying
    his trolls, when at least you added value that he's incapable of adding.

    Daniel70 has never posted anything of value in his entire life, Carlos.
    And you amplify his trolls?

    Why?

    This is off topic with respect to the PSA on HTML fragments so I'm going to
    ask you to stop incessantly trolling these newsgroups with untoward
    conspiracy theory fabrications that have absolutely no basis in fact.

    We've had this conversation ten thousand times and you just don't get it.
    Out of a million things about privacy, you likely know about 2 or 3.

    I must've posted, oh, I don't know, hundreds if not thousands of posts
    about privacy on Usenet over the decades, and yet you know none of it.

    Who, on earth, except me, organizes their iPad like this for example
    <https://i.postimg.cc/LXzB3Lc0/appleid01.jpg>

    Bear in mind that out of about a million things I know about privacy,
    most people know about 3 of those million things, and they do even fewer.

    I never hide from you and for you to intimate I do reeks of ....
    (I'm trying to be nice so I'm not going to say what it reeks of)

    Nobody who owns a working synapse could ever claim I'm "hiding" from you.
    <https://i.postimg.cc/sDWhsB18/editor-pic.jpg>

    Who, but me, writes entire tutorials on privacy, hundreds to thousands of
    them, e.g., for how to use a different web browser, one for each task.
    <https://i.postimg.cc/fT2J40RD/windows-cascade-menu.jpg>

    Who organizes a phone as brilliantly as I have done posting screenshots?
    <https://i.postimg.cc/xTDmWpt4/organization-phone-pc.jpg>

    Who says time and again, for decades, that he lives in the same house in
    the same Santa Cruz mountains with the same WISP with the same everything?
    <https://i.postimg.cc/hjjVXkq5/taskbarmenu07.jpg>

    Same job. Same age. Same computers. Same home. Same kids. Same grandkids.
    *Same everything!*

    If you (or anyone else) can't figure out my posts after two decades of the
    same pictures, the same computer, the same location in the Santa Cruz
    mountains, the same tutorials, the same writing style, the same perfect
    punctuation, the same excellent grammar, the same typos (most likely), the
    same phone, the same iPads, the same attitude toward privacy and
    efficiency, etc., then... um... er... well, you have no right to claim I'm
    'hiding' from you when, after a thousand posts like this, you figure it out.
    <https://i.postimg.cc/fW38dhsX/android-windows-menus.jpg>

    If I were "hiding" from you, would I repeatedly post the same unique
    screenshots of my Windows & Android & iOS GUI, when only 3 out of a
    million people are as organized on a computer as I am?
    <https://i.postimg.cc/TPDd40Br/app-cleaner-uninstaller.jpg>

    I don't want to be "not nice" but anyone saying I'm "hiding from them" is,
    well, I can't think of a word that's nice so I'll just say they're ....
    --
    The people who deprecate privacy are always those who don't understand it.
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Fri Feb 13 19:30:31 2026
    From Newsgroup: alt.comp.os.windows-10

    Hank Rogers wrote:
    He is hopelessly psychotic

    Hi Hank,

    Even you deserve respect so I'm going to ask you to summarize what you
    think is wrong in this thread that adults like Carlos, Andy Paul and I have posted about HTML fragments causing issues in certain tested situations.

    If you can't do that, meaning you can't add value to any adult topic, then
    I would like to ask why you're so desperate to fantasize that you can?

    Please answer at least one of the two adult questions asked of you above.
    Otherwise, I must ask you to stop infesting this ng with your trolls.
    --
    Bear in mind there are entire threads about Hank Rogers' incessant trolls.
  • From Hank Rogers@Hank@nospam.invalid to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Fri Feb 13 18:33:35 2026
    From Newsgroup: alt.comp.os.windows-10

    Maria Sophia wrote on 2/13/2026 6:24 PM:

    For you, Arlen, EVERYONE is a troll. You are psychotic. I wish I could
    help you, but you need a close personal psychiatrist to guide you. Good
    luck.

  • From Carlos E. R.@robin_listas@es.invalid to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Sat Feb 14 02:29:30 2026
    From Newsgroup: alt.comp.os.windows-10

    On 2026-02-14 01:24, Maria Sophia wrote:
    Carlos E. R. wrote:
    I'll abstain from expressing an opinion now.

    Hi Carlos,

    *PLEASE STOP TROLLING THIS NEWSGROUP WITH YOUR CONSPIRACY THEORIES*

    If you start insulting me, I stop reading. What the heck are you talking about, conspiracy theories?

    PLONK.
    --
    Cheers,
    Carlos E.R.
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Sat Feb 14 17:23:49 2026
    From Newsgroup: alt.comp.os.windows-10

    Carlos E. R. wrote:
    If you start insulting me, I stop reading. What the heck are you talking about, conspiracy theories?

    Hi Carlos,

    In this thread we succeeded in taking a rather detailed pernicious problem
    that may have been treated as mysterious without knowing the cause and we
    were able to pin down its actual cause by tracing the interaction between
    four different systems: Chromium, the Windows clipboard, Scintilla, and Notepad++.

    In doing so, we demonstrated that Chromium always places a CF_HTML fragment
    on the clipboard even when the user sees no formatting and that this
    fragment triggers Scintilla's "HTML paste mode," which silently disables
    Ctrl+A until the buffer is edited.

    We then discovered that Notepad++ does not execute macros directly from shortcuts.xml, but instead loads them into an internal array and continues using the last valid version unless the macro is explicitly deleted and rebuilt.

    Once we understood that behavior, we created a macro that reliably breaks
    HTML fragment mode, normalizes Unicode, removes zero-width characters,
    cleans up punctuation, and returns a pure ASCII result to the clipboard.
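
    As a rough illustration of the text-cleaning half of that macro (not the macro itself, whose shortcuts.xml contents are not reproduced here), the following Python sketch performs the same kinds of steps: it drops zero-width characters, maps typographic punctuation to ASCII, normalizes Unicode, and returns pure ASCII. The function name clean_to_ascii and both lookup tables are hypothetical and chosen only for this example.

```python
import unicodedata

# Hypothetical tables for illustration; the real macro's rules may differ.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}
PUNCTUATION = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en dash, em dash
    "\u2026": "...",                # horizontal ellipsis
    "\u00a0": " ",                  # no-break space
}


def clean_to_ascii(text: str) -> str:
    """Return an ASCII-only approximation of text."""
    # 1. Remove zero-width characters outright.
    text = "".join(ch for ch in text if ch not in ZERO_WIDTH)
    # 2. Replace typographic punctuation with plain ASCII equivalents.
    text = "".join(PUNCTUATION.get(ch, ch) for ch in text)
    # 3. Decompose accented letters into base letter + combining mark.
    text = unicodedata.normalize("NFKD", text)
    # 4. Drop whatever still is not ASCII (e.g. the combining marks).
    return text.encode("ascii", "ignore").decode("ascii")


sample = "caf\u00e9 \u201cok\u201d\u200b \u2014 done\u2026"
print(clean_to_ascii(sample))
```

    Step 4 is deliberately lossy: any character with no ASCII equivalent is silently dropped, which suits a "pure ASCII" goal but would not suit text that must keep its accents.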

    In short, together we reverse-engineered a subtle, multi-layered bug,
    explained why Firefox doesn't trigger it, and produced a working,
    reproducible fix that nobody else in the thread had the technical depth or persistence to uncover and explain alone, least of all me.

    That's the technical aspect we covered in this thread, but then there's the emotional aspect which you seem to be reacting to, and, truth be told, I shouldn't have reacted to the trolls by Hank Rogers & Daniel70, especially
    as neither showed any understanding of the topic and yet both felt
    desperate to butt into the conversation for reasons known only to them.

    I apologize if you took my suggestion that you not amplify their trolls
    in a way that I didn't mean. I feel you misunderstood what I meant,
    and that's fine, as tone is hard to read on Usenet, but nothing I said was intended as an insult.

    When I referred to "conspiracy theories," I was talking specifically about
    the repeated claim by the trolls who are desperate to say something,
    anything! even as they don't understand any topic we've discussed, that I
    am "hiding," "evading," or "changing identities to avoid people."

    I explained in detail why that claim has no basis in fact, and I've
    explained the privacy reasons behind my header-naming conventions many
    times over the years.

    You're free to disagree with my privacy choices, but attributing motives to
    me that aren't true is exactly the kind of misunderstanding I was trying to clear up. That's all.

    If you don't want to continue the TECHNICAL discussion, that's your choice. But, in all seriousness, I think I've solved all the difficult problems.

    Since I covered the topic at the level of a published paper, there's really
    not much more left to discuss, as far as I can tell. It's all fixed now.

    I'm simply clarifying the record so the technical thread doesn't get
    derailed by conspiracy-theory assumptions about my clear intentions.
    --
    Often those who most deprecate privacy are those who least understand it.
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Sun Feb 15 06:20:59 2026
    From Newsgroup: alt.comp.os.windows-10

    To keep tests together, here is a post from Lawrence showing his tests
    on Linux, similar to those Carlos tried to run, which match this PSA exactly!


    From: Lawrence D'Oliveiro <ldo@nz.invalid>
    Newsgroups: alt.comp.software.firefox,comp.sys.mac.system,alt.os.linux
    Subject: Re: PSA: Clipboard differences between Chromium & Firefox across platforms
    Date: Fri, 13 Feb 2026 23:17:22 -0000 (UTC)
    Message-ID: <10mobe2$2nt13$1@dont-email.me>

    On Thu, 12 Feb 2026 15:26:32 -0500, Maria Sophia wrote:

    The HTML Fragment issue showed up for me on Windows when pasting
    into Notepad++ from Chromium, but the underlying cause applies to
    all platforms because it apparently comes from how Chromium
    generates clipboard data, which is DIFFERENT from how Firefox does
    the same copy/paste tasks.

    Even as Firefox uses a different and more conservative clipboard
    path, the problem seemed almost random because the editor is
    intimately involved.

    Do you have a tool for inspecting the clipboard contents? In
    particular, listing the different formats in which the clipboard
    contents are being offered? That might shed more light on what exactly
    is going on.

    Here are some examples from my Linux system.

    * Copying some text from Emacs:

    ldo@theon:~> wl-paste -l
    GTK_TEXT_BUFFER_CONTENTS
    application/x-gtk-text-buffer-rich-text
    text/plain;charset=utf-8
    UTF8_STRING
    COMPOUND_TEXT
    TEXT
    text/plain
    STRING
    text/plain;charset=utf-8
    text/plain
    SAVE_TARGETS

    * Copying some text from a web page in Firefox:

    ldo@theon:~> wl-paste -l
    text/html
    text/_moz_htmlcontext
    text/_moz_htmlinfo
    text/plain;charset=utf-8
    UTF8_STRING
    COMPOUND_TEXT
    TEXT
    text/plain
    STRING
    text/plain;charset=utf-8
    text/plain
    text/x-moz-url-priv
    SAVE_TARGETS

    * Copying the same text from the same web page in Chromium:

    ldo@theon:~> wl-paste -l
    chromium/x-source-url
    text/html
    STRING
    TEXT
    UTF8_STRING
    text/plain
    text/plain;charset=utf-8
    chromium/x-internal-source-rfh-token
    text/plain;charset=utf-8

    * Copying the above text from KDE Konsole:

    ldo@theon:~> wl-paste -l
    text/plain
    text/html
    text/plain;charset=utf-8
    --
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Sun Feb 15 01:27:16 2026
    From Newsgroup: alt.comp.os.windows-10

    The knowledgeable folks on the Linux newsgroup ran tests confirming this
    PSA exactly (see Lawrence D'Oliveiro's tests I forwarded moments ago).

    Since those on these Windows newsgroups might not be familiar with Linux,
    here is just my take on the confirmation that Lawrence kindly provided.

    Lawrence D'Oliveiro wrote:
    Do you have a tool for inspecting the clipboard contents? In
    particular, listing the different formats in which the clipboard
    contents are being offered? That might shed more light on what exactly
    is going on.

    Here are some examples from my Linux system.

    * Copying some text from Emacs:

    ldo@theon:~> wl-paste -l
    GTK_TEXT_BUFFER_CONTENTS
    application/x-gtk-text-buffer-rich-text
    text/plain;charset=utf-8
    UTF8_STRING
    COMPOUND_TEXT
    TEXT
    text/plain
    STRING
    text/plain;charset=utf-8
    text/plain
    SAVE_TARGETS

    * Copying some text from a web page in Firefox:

    ldo@theon:~> wl-paste -l
    text/html
    text/_moz_htmlcontext
    text/_moz_htmlinfo
    text/plain;charset=utf-8
    UTF8_STRING
    COMPOUND_TEXT
    TEXT
    text/plain
    STRING
    text/plain;charset=utf-8
    text/plain
    text/x-moz-url-priv
    SAVE_TARGETS

    * Copying the same text from the same web page in Chromium:

    ldo@theon:~> wl-paste -l
    chromium/x-source-url
    text/html
    STRING
    TEXT
    UTF8_STRING
    text/plain
    text/plain;charset=utf-8
    chromium/x-internal-source-rfh-token
    text/plain;charset=utf-8

    * Copying the above text from KDE Konsole:

    ldo@theon:~> wl-paste -l
    text/plain
    text/html
    text/plain;charset=utf-8

    Hi Lawrence,

    Wow. That was excellent detective work you just did for the team!
    Thanks for taking the time to run all those Linux-based wl-paste tests.

    Of all who posted, only you, Carlos, and I have been testing this and
    reporting back what we've found, all of which confirms the basic premise.

    Each section of your output helps confirm the pattern I was trying to
    describe in the PSA and your suggestion for me to do the same resonated.

    Your Emacs example shows a plain-text-oriented application offering a
    large set of classic X11 and GTK text targets. That is the baseline
    case, nothing surprising there.

    Your Firefox example shows exactly what we had expected. Firefox adds the _moz_htmlcontext and _moz_htmlinfo formats only when it believes the
    selection contains meaningful structure. That matches what I intuitively
    see on Windows, where Firefox emits HTML Format only when needed.

    But I didn't have a clipboard inspector until I looked one up just now.
    <https://www.nirsoft.net/utils/inside_clipboard.html>
    <https://www.nirsoft.net/utils/insideclipboard.zip>
    Name: insideclipboard.zip
    Size: 42653 bytes (41 KiB)
    SHA256: 13E71984F63C0C50E7710B92505D0B5BF422CA5214B61EAF51E02CB8A4B63B7E

    Name: InsideClipboard.exe
    Size: 37376 bytes (36 KiB)
    SHA256: 89C7BF5136E5BC1572325197C97FEA33FBC9F106B04AA924BEB58B5A687D2DF7

    InsideClipboard v1.30
    This utility works on any version of Windows, from Win XP to Win 11.
    "Each time that you copy something into the clipboard for pasting
    it into another application, the copied data is saved into multiple
    formats. The main clipboard application of Windows only display the
    basic clipboard formats, like text and bitmaps, but doesn't display
    the list of all formats that are stored in the clipboard.
    InsideClipboard is a small utility that displays the binary content
    of all formats that are currently stored in the clipboard, and allow
    you to save the content of specific format into a binary file."

    The pot of gold at the other end of the rainbow, though, was your Chromium example, where Chromium always emits text/html plus its own internal chromium/x-* formats, even when the selection looks like plain text.

    That is the same behavior I ran into on Windows, where the HTML Fragment
    block is always present and can influence how the plain text stream is
    parsed by editors that do not expect it.

    Your Konsole example shows a terminal that offers plain text first but
    still includes text/html. That reinforces the assumption on my part that
    the editor or application decides which format to request, and that the presence of HTML can change how the paste is interpreted.

    All of your Linux results line up with what I saw on Windows. The
    difference is not the platform, it is the clipboard formats that
    Chromium places on the clipboard (which differ from what FF places).
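
    To make that comparison concrete, here is a small Python sketch that takes the two target lists exactly as they appear in Lawrence's wl-paste -l output and reports which targets are shared and which are browser-specific. The helper name compare is hypothetical; the data is copied from the quoted output and covers only that one sample.

```python
# Targets copied from the wl-paste -l output quoted in this thread.
FIREFOX = [
    "text/html", "text/_moz_htmlcontext", "text/_moz_htmlinfo",
    "text/plain;charset=utf-8", "UTF8_STRING", "COMPOUND_TEXT", "TEXT",
    "text/plain", "STRING", "text/plain;charset=utf-8", "text/plain",
    "text/x-moz-url-priv", "SAVE_TARGETS",
]
CHROMIUM = [
    "chromium/x-source-url", "text/html", "STRING", "TEXT", "UTF8_STRING",
    "text/plain", "text/plain;charset=utf-8",
    "chromium/x-internal-source-rfh-token", "text/plain;charset=utf-8",
]


def compare(a, b):
    """Split two target lists into shared and one-sided sets (sorted)."""
    sa, sb = set(a), set(b)
    return {
        "both": sorted(sa & sb),
        "only_a": sorted(sa - sb),
        "only_b": sorted(sb - sa),
    }


result = compare(FIREFOX, CHROMIUM)
for key, targets in result.items():
    print(key, targets)
```

    In this one sample, text/html shows up in both lists; what differs is the browser-specific targets (the _moz_ entries on one side, the chromium/x-* entries on the other).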

    I will try the NirSoft InsideClipboard tool tomorrow so I can see the
    Windows formats directly, the same way your wl-paste -l output shows
    them on Linux. If you have a simple test procedure you want me to run
    with InsideClipboard, let me know and I will follow it exactly.

    Otherwise, off the cuff, what I think I may try is this procedure
    which keeps the variables controlled so we can compare results.

    1. Open a plain-looking web page in Firefox.
    2. Select a short block of visible text, no images.
    3. Press Ctrl+C.
    4. Open InsideClipboard and note every format listed.
    5. Save the list or take a screenshot for reference.

    6. Repeat the same steps in Chromium:
    A. Same page.
    B. Same text selection.
    C. Press Ctrl+C.
    D. Open NirSoft InsideClipboard and note the formats.

    7. Compare the two lists. The key question is whether Chromium
    always includes HTML Format and related offsets even when the
    selection looks like plain text, and whether Firefox omits
    HTML Format when it decides the selection has no structure.

    8. If I want a third comparison, I can paste the same selection into
    Notepad++ and then check InsideClipboard again. I think the formats
    will disappear after the paste, which would confirm that the metadata
    never enters the file buffer (which is why Notepad++ hex editor never
    saw anything).

    If I follow these steps tomorrow, maybe I can line up the Windows results
    with the wl-paste output you showed on Linux.
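
    For anyone who would rather script step 4 than read it off a GUI, the following is a minimal Python sketch, assuming a Windows machine with a stock Python install, that lists the formats currently on the clipboard through the Win32 calls OpenClipboard, EnumClipboardFormats, GetClipboardFormatNameW, and CloseClipboard. The function name list_clipboard_formats and the small STANDARD_FORMATS table are hypothetical; the sketch is not part of InsideClipboard and has not been run against the tests in this thread.

```python
import sys

# Built-in format IDs mentioned in this thread; Windows does not return
# a name string for these, so they are labelled here by hand.
STANDARD_FORMATS = {
    1: "CF_TEXT",
    7: "CF_OEMTEXT",
    13: "CF_UNICODETEXT",
    16: "CF_LOCALE",
}


def list_clipboard_formats():
    """Return [(format_id, name), ...] for the current Windows clipboard."""
    if sys.platform != "win32":
        raise OSError("this sketch only runs on Windows")

    import ctypes
    from ctypes import wintypes

    user32 = ctypes.WinDLL("user32", use_last_error=True)
    user32.OpenClipboard.argtypes = [wintypes.HWND]
    user32.OpenClipboard.restype = wintypes.BOOL
    user32.EnumClipboardFormats.argtypes = [wintypes.UINT]
    user32.EnumClipboardFormats.restype = wintypes.UINT
    user32.GetClipboardFormatNameW.argtypes = [
        wintypes.UINT, wintypes.LPWSTR, ctypes.c_int,
    ]
    user32.GetClipboardFormatNameW.restype = ctypes.c_int
    user32.CloseClipboard.argtypes = []
    user32.CloseClipboard.restype = wintypes.BOOL

    if not user32.OpenClipboard(None):
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        formats = []
        fmt = user32.EnumClipboardFormats(0)  # 0 starts the enumeration
        while fmt:
            buf = ctypes.create_unicode_buffer(256)
            length = user32.GetClipboardFormatNameW(fmt, buf, len(buf))
            # Registered formats have a name; predefined ones do not.
            name = buf.value if length else STANDARD_FORMATS.get(fmt, "(predefined)")
            formats.append((fmt, name))
            fmt = user32.EnumClipboardFormats(fmt)
        return formats
    finally:
        user32.CloseClipboard()


if __name__ == "__main__" and sys.platform == "win32":
    for format_id, name in list_clipboard_formats():
        print(format_id, name)
```

    Running it once after copying from Firefox and once after copying from Chromium would give two lists that can be compared line by line, the same way the wl-paste -l output was compared on Linux.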

    Thanks again for running the detailed tests which prove what we've
    experienced the hard way, and for suggesting I use a clipboard inspector.

    Do you think this test will suffice to sync up with your efforts?
    --
    There are two kinds of people on Usenet, one of which can add value.
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Sun Feb 15 20:56:02 2026
    From Newsgroup: alt.comp.os.windows-10

    Voila! Proof!

    This was posted to the Firefox newsgroup just moments ago, but it also
    belongs to the Windows newsgroup because the debug below is on Windows.

    It proves the "problem" which proves the PSA correct.
    But even better, it explains the "solution" which is what is needed.

    Note the "format" is a numeric identifier that Windows uses internally to
    label a clipboard format.
    Format ID 1 = CF_TEXT
    Format ID 7 = CF_OEMTEXT
    Format ID 13 = CF_UNICODETEXT
    Format ID 16 = CF_LOCALE
    These are built in Windows formats.
    They exist on every Windows system.

    Yet the important ones in this test are:
    Format ID 49426 = HTML Format
    Format ID 49661 = Chromium internal source RFH token
    Format ID 49683 = Chromium internal source URL

    For each of those, Chromium told Windows (taking one as an example):
    "I want to register a clipboard format named HTML Format"
    and Windows assigned it ID 49426.

    Why are the numbers so big?
    Because Windows built-in formats use small numbers:
    1, 7, 13, 16, etc.
    While application-registered formats use large numbers:
    49426, 49661, 49683.
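
    The "big number" rule can be stated precisely: Windows documents that formats registered by name receive IDs in the range 0xC000 through 0xFFFF. The short Python sketch below applies that range to the seven IDs from this test. The function name classify is hypothetical, and the exact numbers a registered format receives should be assumed to vary between systems or sessions.

```python
# Range Windows documents for formats registered by name.
REGISTERED_MIN = 0xC000  # 49152
REGISTERED_MAX = 0xFFFF  # 65535


def classify(format_id: int) -> str:
    """Label a clipboard format ID as registered-by-name or not."""
    if REGISTERED_MIN <= format_id <= REGISTERED_MAX:
        return "registered"
    return "predefined or other"


# IDs taken from the InsideClipboard output in this post.
for format_id in (1, 7, 13, 16, 49426, 49661, 49683):
    print(format_id, hex(format_id), classify(format_id))
```

    For example, 49426 is 0xC112, which falls inside the registered range, while 13 (CF_UNICODETEXT) does not.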

    These indicate that Chromium placed multiple formats on the clipboard:
    HTML Format
    Chromium internal source RFH token
    Chromium internal source URL

    But when your Notepad++ macro rewrote the clipboard, all of those formats disappeared.
    A. This is why Ctrl+A started working again.
    B. The shortcuts.xml CTRL+B macro removed the HTML Fragment land mine

    Voila!
    I love one-step automation!

    Lawrence D'Oliveiro wrote:
    On Sun, 15 Feb 2026 07:54:11 -0500, Paul wrote:

    There's no guarantee any clipboards softwares will even agree on
    what is on the clipboard. Some of the clipboard items could be
    automatic translations of things submitted by the sourcing
    application.

    If the source application supplies a format for the Clipboard, it
    makes sense for any app examining the Clipboard to see that format.

    And Wayland being a latecomer, of course it's going to have to do
    weird shit, to get a name for itself.

    Feel free to reproduce my tests, in a suitable way, on a pure-X11, Wayland-free system, then, just to see what "weird shit" means.

    Hi Lawrence,

    I, for one, greatly admire and appreciate what Lawrence has done for the
    team, especially as Apple, Linux and Firefox users are on this thread.

    I ran NirSoft InsideClipboard on Windows today and saw the same pattern
    that Lawrence showed on Linux. Chromium placed CF_TEXT, CF_OEMTEXT, CF_UNICODETEXT, CF_LOCALE, HTML Format, and two Chromium internal formats on the
    clipboard. That matches his Linux results one for one, just expressed in Windows clipboard terminology instead of MIME types.

    ==================================================
    Format ID : 1
    Format Name : CF_TEXT
    Handle Type : Memory
    Size : 244
    Index : 6
    ==================================================

    ==================================================
    Format ID : 7
    Format Name : CF_OEMTEXT
    Handle Type : Memory
    Size : 244
    Index : 7
    ==================================================

    ==================================================
    Format ID : 13
    Format Name : CF_UNICODETEXT
    Handle Type : Memory
    Size : 488
    Index : 2
    ==================================================

    ==================================================
    Format ID : 16
    Format Name : CF_LOCALE
    Handle Type : Memory
    Size : 4
    Index : 5
    ==================================================

    ==================================================
    Format ID : 49426
    Format Name : HTML Format
    Handle Type : Memory
    Size : 1,204
    Index : 1
    ==================================================

    ==================================================
    Format ID : 49661
    Format Name : Chromium internal source RFH token
    Handle Type : Memory
    Size : 24
    Index : 3
    ==================================================

    ==================================================
    Format ID : 49683
    Format Name : Chromium internal source URL
    Handle Type : Memory
    Size : 58
    Index : 4
    ==================================================

    What Lawrence didn't do, since he wasn't cleaning text and I am, is run it through the Notepad++ macro, which puts things back on the clipboard.

    This is the NirSoft InsideClipboard result after running Control+B
    (which adds a space, deletes it, cleans the text, & repopulates the
    clipboard with the cleaned text and wipes it off of N++ in 1 step).

    ==================================================
    Format ID : 1
    Format Name : CF_TEXT
    Handle Type : Memory
    Size : 330
    Index : 3
    ==================================================

    ==================================================
    Format ID : 7
    Format Name : CF_OEMTEXT
    Handle Type : Memory
    Size : 330
    Index : 4
    ==================================================

    ==================================================
    Format ID : 13
    Format Name : CF_UNICODETEXT
    Handle Type : Memory
    Size : 660
    Index : 1
    ==================================================

    ==================================================
    Format ID : 16
    Format Name : CF_LOCALE
    Handle Type : Memory
    Size : 4
    Index : 2
    ==================================================

    I have no idea, just yet, what that means in detail, but it seems to show
    a. Chromium always puts plain text formats on the clipboard.
    b. Chromium always puts HTML Format on the clipboard.
    c. Chromium always puts its internal metadata formats on the clipboard.
    CF_TEXT
    CF_OEMTEXT
    CF_UNICODETEXT
    CF_LOCALE
    HTML Format
    Chromium internal source RFH token
    Chromium internal source URL

    This appears to match Lawrence's Linux results one for one.
    Hence, I believe it confirms the Chromium part of the PSA.

    While the Notepad++ macro is really part of a separate Windows thread, we
    note here that after the CTRL+B macro ran, the clipboard contained only:
    CF_TEXT
    CF_OEMTEXT
    CF_UNICODETEXT
    CF_LOCALE

    The control+b macro stripped out all non text formats, including:
    HTML Format
    Chromium internal formats
    This means the shortcuts.xml macro is not only cleaning the text.
    It is also replacing the clipboard contents with a new, plain-text-only clipboard (since everything we do is always tuned to a single step).

    That new clipboard no longer contains the HTML Fragment land mine.
    This explains how and why the Notepad++ macro fixes the problem.
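
    The before/after comparison can also be checked mechanically. The Python sketch below uses the format IDs, names, and sizes from the two InsideClipboard dumps in this post and reports which formats were present before the macro ran and absent afterwards. The helper names removed_formats and size_of are hypothetical.

```python
# (format id, format name, size in bytes) from the InsideClipboard dumps above.
BEFORE = [
    (1, "CF_TEXT", 244),
    (7, "CF_OEMTEXT", 244),
    (13, "CF_UNICODETEXT", 488),
    (16, "CF_LOCALE", 4),
    (49426, "HTML Format", 1204),
    (49661, "Chromium internal source RFH token", 24),
    (49683, "Chromium internal source URL", 58),
]
AFTER = [
    (1, "CF_TEXT", 330),
    (7, "CF_OEMTEXT", 330),
    (13, "CF_UNICODETEXT", 660),
    (16, "CF_LOCALE", 4),
]


def removed_formats(before, after):
    """Names present in the first dump but missing from the second."""
    kept = {name for _, name, _ in after}
    return [name for _, name, _ in before if name not in kept]


def size_of(dump, wanted):
    """Size in bytes of the named format in a dump."""
    return next(size for _, name, size in dump if name == wanted)


print(removed_formats(BEFORE, AFTER))
for dump in (BEFORE, AFTER):
    # CF_UNICODETEXT is UTF-16, so it is twice the size of CF_TEXT here.
    print(size_of(dump, "CF_TEXT"), size_of(dump, "CF_UNICODETEXT"))
```

    In both dumps the CF_UNICODETEXT entry is exactly twice the size of CF_TEXT (488 vs 244, then 660 vs 330), which is what two bytes per character would give for text of this kind.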

    In summary,
    a. Chromium puts HTML Format on the clipboard.
    b. Notepad++ sees that HTML Format exists.
    c. Even though Notepad++ requests plain text, the presence of
    that HTML Format changes how the paste is interpreted.
    d. Which breaks Ctrl+A.

    The macro wipes the clipboard and replaces it with plain text only.
    Once the HTML Format is gone, Ctrl+A works again. Voila!

    The macro fixes the problem because it removes the HTML Format block by rewriting the clipboard, which is exactly the invisible land mine described
    in the PSA's original post.

    Without everyone's help, particularly that of Lawrence, Paul & Carlos, I
    never would have gotten this far in testing and explaining how it works.
    --
    My brain is wired for Occam's Razor, but to be always 100% logically
    correct, I need to have as many facts as I can to fit into the picture.
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Sun Feb 15 21:27:35 2026
    From Newsgroup: alt.comp.os.windows-10

    Maria Sophia wrote:
    Without everyone's help, particularly that of Lawrence, Paul & Carlos, I never would have gotten this far in testing and explaining how it works.

    This reminds me, in a small way, of how I felt when reading Einstein's 1916 book (later revised in the early 1920's, which lost copyright 100 years
    later) in that every revelation reveals a new mystery to resolve next.

    Keeping in mind the whole thing started when I pasted Chromium text into Notepad++ which caused Control+A to die, this is a short explanation.

    Windows assigns every clipboard format a numeric ID, e.g., 1, 7, 13, 16.
    CF_TEXT is the old ANSI text format.
    CF_OEMTEXT is the old OEM codepage text format.
    CF_UNICODETEXT is the modern Unicode text format.
    CF_LOCALE tells Windows what language or locale the text came from
    etc.

    While NirSoft InsideClipboard shows those IDs, it turns out that Chromium registers its own formats by name, and Windows assigns each registered name an ID from the range reserved for registered formats (0xC000 through 0xFFFF, i.e. 49152 through 65535), such as 49426 or 49683.

    Notepad++ does not use most of those CF clipboard formats directly.
    Notepad++ almost always just asks Windows only for CF_UNICODETEXT.

    The important detail is that Windows uses the HTML Format entry to generate
    the plain text that Notepad++ receives. That conversion step is where the invisible CTRL+A land mine comes from.

    That means the presence of HTML Format changes how the plain
    text is produced, even though Notepad++ never reads the HTML itself.

    When the control+B shortcuts.xml macro rewrites the clipboard, it removes
    HTML Format and all Chromium internal formats.

    With only plain text formats left, Windows no longer has to convert from
    HTML, so the plain text becomes clean and Ctrl+A works again.

    But what exactly is causing Control+A to stop working in Notepad++?
    The reason Ctrl+A dies is not that HTML is pasted into the file.

    The problem actually happens earlier, inside Windows, when Windows converts
    the HTML Format entry into CF_UNICODETEXT for Notepad++.

    When Chromium puts HTML Format on the clipboard, Windows must run its HTML-to-text converter. That converter uses the StartHTML, EndHTML, StartFragment, and EndFragment offsets inside the HTML Fragment block.
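
    For readers who have not seen those offsets before, here is a minimal Python sketch of the documented HTML Format header layout: a few "Name:number" lines followed by the HTML, where each number is a byte offset from the start of the payload. It builds a small sample payload and then reads the fragment back out using StartFragment and EndFragment. The sample HTML, the 10-digit padding, and the function names build_cf_html and parse_cf_html are assumptions made for illustration; the sketch shows only the header layout, not any conversion step.

```python
import re

HEADER = (
    "Version:0.9\r\n"
    "StartHTML:{:010d}\r\n"
    "EndHTML:{:010d}\r\n"
    "StartFragment:{:010d}\r\n"
    "EndFragment:{:010d}\r\n"
)
PREFIX = b"<html><body><!--StartFragment-->"
SUFFIX = b"<!--EndFragment--></body></html>"


def build_cf_html(fragment: str) -> bytes:
    """Wrap an HTML fragment in an HTML Format header with byte offsets."""
    # Fixed-width numbers keep the header length constant.
    header_len = len(HEADER.format(0, 0, 0, 0).encode("utf-8"))
    body = fragment.encode("utf-8")
    start_html = header_len
    start_frag = start_html + len(PREFIX)
    end_frag = start_frag + len(body)
    end_html = end_frag + len(SUFFIX)
    header = HEADER.format(start_html, end_html, start_frag, end_frag)
    return header.encode("utf-8") + PREFIX + body + SUFFIX


def parse_cf_html(payload: bytes):
    """Return (offsets dict, fragment text) from an HTML Format payload."""
    pairs = re.findall(
        rb"(StartHTML|EndHTML|StartFragment|EndFragment):(\d+)", payload
    )
    offsets = {name.decode("ascii"): int(value) for name, value in pairs}
    start, end = offsets["StartFragment"], offsets["EndFragment"]
    return offsets, payload[start:end].decode("utf-8")


payload = build_cf_html("Hello <b>world</b>")
offsets, fragment = parse_cf_html(payload)
print(offsets)
print(fragment)
```

    Because the offsets count bytes rather than characters, a fragment containing non-ASCII text still round-trips as long as both sides agree on UTF-8.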

    If those offsets are wrong, or if the HTML fragment is malformed, the
    converter can produce a CF_UNICODETEXT stream with hidden control
    characters, mismatched boundaries, or an unexpected buffer length.

    Notepad++ receives that CF_UNICODETEXT stream and loads it into its
    internal Scintilla buffer. If the buffer contains an unexpected control sequence or a broken length field, Scintilla can fail to compute the
    full document range.

    Bingo!

    When that happens, Ctrl+A does not select the whole buffer because
    Scintilla thinks the document ends earlier than it actually does.

    The Control+B macro fixes the issue because it wipes the clipboard and
    replaces it with plain text only (among other things that it does).

    With no HTML Format present, Windows does not run the HTML-to-text
    converter again, so the CF_UNICODETEXT stream is finally clean
    and Scintilla can compute the correct document length.

    Once the buffer is clean, Ctrl+A works again.
    Whew!

    Given this took me hours to debug & resolve, the whole point of this PSA is
    to help the next person not have to do all the work that I just had to do!
    --
    "Everything should be made as simple as possible, but not simpler."
  • From Maria Sophia@mariasophia@comprehension.com to alt.comp.os.windows-10,alt.comp.os.windows-11,alt.comp.microsoft.windows on Sun Feb 15 21:57:21 2026
    From Newsgroup: alt.comp.os.windows-10

    Maria Sophia wrote:
    Since I covered the topic at the level of a published paper, there's really not much more left to discuss, as far as I can tell. It's all fixed now.

    All that was left after fixing the issue was understanding what actually
    went wrong in the first place (which killed Ctrl+A in Notepad++).

    It turns out that Windows does not convert the HTML Format entry into text until an application explicitly asks for a text format.

    So the corruption happens at the moment Notepad++ requests CF_UNICODETEXT.

    The sequence (as far as I can re-construct it) is...

    1. With Ctrl+C, Chromium places several formats on the clipboard,
    including HTML Format, CF_UNICODETEXT, and its internal metadata.

    2. With Ctrl+V, Notepad++ asks Windows:
    "Give me CF_UNICODETEXT."

    3. Windows sees that HTML Format is available and may choose to generate
    the CF_UNICODETEXT stream by converting the HTML fragment.

    Kaboom!

    4. That conversion step can produce a corrupted CF_UNICODETEXT stream.
    The corruption is not visible text, which is why I couldn't "see" it.
    It is a bad length field or a hidden control character (apparently).

    Is that a bug?
    I don't know.

    5. Scintilla loads that corrupted stream into its internal buffer.
    But the buffer boundaries are now wrong, so Ctrl+A fails because
    Scintilla thinks the document ends earlier than it actually does.

    So why didn't I see it in the Notepad++ hex editor?

    The HTML is never pasted into the file, so it can't be seen.
    But it affects the text Windows hands to Notepad++ at paste time.

    Well then, why does adding and deleting a character fix it?

    Because the corruption lives only in Scintilla's internal buffer
    structures, not in the visible text. When the macro inserts a space,
    Scintilla is forced to rebuild its entire buffer. That rebuild wipes out
    the corrupted boundary. Removing the space forces a second rebuild,
    which simply restores the original content. The second rebuild is not
    needed for the fix; it is only needed to undo the temporary change.

    After that, the macro selects all and cuts the text. Cutting forces
    Windows to create a brand new clipboard entry. This new clipboard entry contains only plain text formats, because Scintilla does not generate
    HTML Format or any Chromium internal formats.

    I don't know if this is a bug or not, as all I know, in the end, is...
    1. The corrupted CF_UNICODETEXT stream from the original paste is gone.
    2. The clipboard now contains only clean plain text.
    3. Scintilla now has a clean buffer with correct boundaries.
    4. Ctrl+A works again.

    Woo hoo!

    So the fix works because Windows created the problem when converting the
    HTML fragment into text, which corrupted Scintilla's internal buffer.
    Adding and removing a character forces Scintilla to rebuild its buffer,
    and cutting the text forces Windows to rebuild the clipboard without
    HTML Format. The corruption cannot survive those two rebuilds.

    I think we explained it as simply as we could, but not simpler.
    --
    How wonderful that we have met with a paradox.
    Now we have some hope of making progress.