• SetWindowsHook WH_CBT HCBT_CREATEWND CREATESTRUCT - making changes?

    From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Wed Nov 12 09:23:18 2025
    From Newsgroup: alt.windows7.general

    Hello all,

    I'm using SetWindowsHook around a simple messageBox, and would like to
    change both the text as well as the Dialog-IDs of the buttons.

    I'm getting HCBT_CREATEWND messages, which point to CREATESTRUCT structures. The thing is that when I change the lpszName and/or hMenu (DlgID) fields, the changes get ignored (overwritten).

    The MSDN webpages do not seem to contain any information about the above.

    Questions:

    Is there information available about which fields in the CREATESTRUCT structure may be changed ?

    Is there perhaps something special I need to do to get some of the changes accepted ?


    Remark: I have found an example which does all the changes when the HCBT_ACTIVATE message is received. iow, I can already make the changes. I'm looking for information.

    Regards,
    Rudy Wieser
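
The HCBT_ACTIVATE approach mentioned in the remark can be sketched roughly as below. This is a minimal, Windows-only Python/ctypes sketch, not the example the poster found: the helper name `show_relabelled_box` is hypothetical, it only relabels the OK button (it does not change dialog IDs), and it assumes the message box's controls already exist by the time the box is activated.

```python
import ctypes
from ctypes import wintypes

WH_CBT = 5         # hook type that delivers HCBT_* notifications
HCBT_ACTIVATE = 5  # hook code: a window is about to be activated
IDOK = 1           # dialog ID of the OK button in an MB_OK message box
MB_OK = 0x0


def show_relabelled_box(text, caption, ok_label):
    """Hypothetical helper: show a message box and relabel its OK button."""
    user32 = ctypes.WinDLL("user32", use_last_error=True)
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)

    LRESULT = ctypes.c_ssize_t
    HOOKPROC = ctypes.WINFUNCTYPE(LRESULT, ctypes.c_int,
                                  wintypes.WPARAM, wintypes.LPARAM)

    user32.SetWindowsHookExW.argtypes = [ctypes.c_int, HOOKPROC,
                                         wintypes.HINSTANCE, wintypes.DWORD]
    user32.SetWindowsHookExW.restype = wintypes.HHOOK
    user32.CallNextHookEx.argtypes = [wintypes.HHOOK, ctypes.c_int,
                                      wintypes.WPARAM, wintypes.LPARAM]
    user32.CallNextHookEx.restype = LRESULT
    user32.UnhookWindowsHookEx.argtypes = [wintypes.HHOOK]
    user32.SetDlgItemTextW.argtypes = [wintypes.HWND, ctypes.c_int,
                                       wintypes.LPCWSTR]
    user32.MessageBoxW.argtypes = [wintypes.HWND, wintypes.LPCWSTR,
                                   wintypes.LPCWSTR, wintypes.UINT]
    kernel32.GetCurrentThreadId.restype = wintypes.DWORD

    hook = None

    def cbt_proc(code, wparam, lparam):
        # For HCBT_ACTIVATE, wParam is the handle of the window being activated.
        if code == HCBT_ACTIVATE:
            user32.SetDlgItemTextW(wparam, IDOK, ok_label)
        return user32.CallNextHookEx(hook, code, wparam, lparam)

    proc = HOOKPROC(cbt_proc)  # keep a reference so it is not garbage-collected
    # Thread-local hook: no module handle, current thread only.
    hook = user32.SetWindowsHookExW(WH_CBT, proc, None,
                                    kernel32.GetCurrentThreadId())
    if not hook:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        return user32.MessageBoxW(None, text, caption, MB_OK)
    finally:
        user32.UnhookWindowsHookEx(hook)
```

On Windows, a call such as `show_relabelled_box("Proceed?", "Demo", "Go ahead")` would be the way to try it; on other platforms the function is defined but cannot run.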


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Thu Nov 13 12:34:41 2025
    From Newsgroup: alt.windows7.general

    On Wed, 12 Nov 2025 09:23:18 +0100, R.Wieser wrote:

    > Hello all,
    >
    > I'm using SetWindowsHook around a simple messageBox, and would like to change both the text as well as the Dialog-IDs of the buttons.
    >
    > I'm getting HCBT_CREATEWND messages, which point to CREATESTRUCT structures. The thing is that when I change the lpszName and/or hMenu (DlgID) fields, the changes get ignored (overwritten).
    >
    > The MSDN webpages do not seem to contain any information about the above.
    >
    > Questions:
    >
    > Is there information available about which fields in the CREATESTRUCT structure may be changed ?
    >
    > Is there perhaps something special I need to do to get some of the changes accepted ?
    >
    > Remark: I have found an example which does all the changes when the HCBT_ACTIVATE message is received. iow, I can already make the changes. I'm looking for information.
    >
    > Regards,
    > Rudy Wieser

    Read the `HCBT_CREATEWND` description carefully.
  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Thu Nov 13 07:51:40 2025
    From Newsgroup: alt.windows7.general

    JJ,

    > Read the `HCBT_CREATEWND` description carefully.

    Where and what part please ?

    If you mean the parts about it on the MSDN "CBTProc callback function" webpage, all it does is mention a few fields while saying nothing about the others.

    That's the whole problem.

    Also, there is no logic in those mentioned fields being changeable*, but none(?) of the others. And no explanation is provided for it. :-(

    * I can change both the button texts and Dialog-IDs** in the HCBT_ACTIVATE callback, so why not in HCBT_CREATEWND ?

    ** a nice gotcha : when the "X" close button is not grayed out it will still refuse to work if there is no button with the IDCANCEL dialog-ID present.

    Regards,
    Rudy Wieser


  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Fri Nov 14 12:09:41 2025
    From Newsgroup: alt.windows7.general

    On Thu, 13 Nov 2025 07:51:40 +0100, R.Wieser wrote:

    > JJ,
    >
    > > Read the `HCBT_CREATEWND` description carefully.
    >
    > Where and what part please ?
    >
    > If you mean the parts about it on the MSDN "CBTProc callback function" webpage, all it does is mention a few fields while saying nothing about the others.
    >
    > That's the whole problem.
    >
    > Also, there is no logic in those mentioned fields being changeable*, but none(?) of the others. And no explanation is provided for it. :-(
    >
    > * I can change both the button texts and Dialog-IDs** in the HCBT_ACTIVATE callback, so why not in HCBT_CREATEWND ?
    >
    > ** a nice gotcha : when the "X" close button is not grayed out it will still refuse to work if there is no button with the IDCANCEL dialog-ID present.
    >
    > Regards,
    > Rudy Wieser

    [quote]
    At the time of the HCBT_CREATEWND notification, the window has been created, but its final size and position may not have been determined and its parent window may not have been established.
    [/quote]
  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Fri Nov 14 08:01:41 2025
    From Newsgroup: alt.windows7.general

    JJ,

    > [quote]
    > At the time of the HCBT_CREATEWND notification, the window has been
    > created, but its final size and position may not have been determined
    > and its parent window may not have been established.
    > [/quote]

    Yes, I read that. What's your point ?

    And for the record : I can, in the HCBT_CREATEWND event, change both the Dialog-ID and name of the control using SetWindowLong GWL_ID and
    SetWindowText calls (verified by reading both back just after setting them), but they get overwritten - just not with what's in CREATESTRUCT.

    And that's the whole thing. Which fields of CREATESTRUCT are (effectively) read-only, and which ones carry over to the finished control ?

    If the coordinates, size and z-order are the /only/ ones then why isn't that mentioned ?

    Regards,
    Rudy Wieser


  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Sat Nov 15 11:37:53 2025
    From Newsgroup: alt.windows7.general

    On Fri, 14 Nov 2025 08:01:41 +0100, R.Wieser wrote:
    > JJ,
    >
    > > [quote]
    > > At the time of the HCBT_CREATEWND notification, the window has been
    > > created, but its final size and position may not have been determined
    > > and its parent window may not have been established.
    > > [/quote]
    >
    > Yes, I read that. What's your point ?

    It's right there in plain sight:

    "At the time of the HCBT_CREATEWND notification, the window has been
    created"

    So anything you change in the CREATESTRUCT structure won't do anything.

    > And for the record : I can, in the HCBT_CREATEWND event, change both the Dialog-ID and name of the control using SetWindowLong GWL_ID and SetWindowText calls (verified by reading both back just after setting them), but they get overwritten - just not with what's in CREATESTRUCT.

    That is entirely different from changing the content of the CREATESTRUCT structure.

    > And that's the whole thing. Which fields of CREATESTRUCT are (effectively) read-only, and which ones carry over to the finished control ?

    None of them is technically read-only. They're just there for notification purposes. They're not for controlling/instructing anything.

    > If the coordinates, size and z-order are the /only/ ones then why isn't that mentioned ?

    In userland, both top-level and control window creation use the same
    function, and that function is affected by STARTUPINFO/EX.
  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Sat Nov 15 08:50:49 2025
    From Newsgroup: alt.windows7.general

    JJ,

    > It's right there in plain sight:
    >
    > "At the time of the HCBT_CREATEWND notification, the window has been
    > created"
    >
    > So anything you change in the CREATESTRUCT structure won't do anything.

    It also says

    "Contains information passed to a WH_CBT hook procedure, CBTProc, *before a window is created*."

    and

    "A pointer to a CREATESTRUCT structure that contains initialization
    parameters for the window *about to be created*."

    and

    "It is possible to send messages to *the newly created window*,"

    So, we have a Schrödinger's cat window which, in the HCBT_CREATEWND event,
    is at the same time both created and not created ?

    Yeah, that really gives me confidence about the correctness/completeness of the rest of the provided information. :-(


    Also,

    "Specifies a long pointer to a CBT_CREATEWND structure containing *initialization parameters for the window*."

    Which is *at best* vague. At face value it seems to indicate that *all* of those parameters are used.

    Which is followed by

    "The parameters *include* the coordinates and dimensions of the window."

    And pardon me, but as far as I know "include" doesn't translate to "with the exclusion of everything else" in *any* language.


    Again : do you perhaps have any information, MSDN or otherwise, about which fields in that CREATESTRUCT may be changed (are carried on to the resulting window) ?

    If that information is not available then I will just determine them empirically.


    And by the way: I just tried to change the Style and ExStyle fields, but
    those also get ignored.

    Regards,
    Rudy Wieser


  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Sun Nov 16 14:16:21 2025
    From Newsgroup: alt.windows7.general

    On Sat, 15 Nov 2025 08:50:49 +0100, R.Wieser wrote:

    > It also says
    >
    > "Contains information passed to a WH_CBT hook procedure, CBTProc, *before a window is created*."
    >
    > and
    >
    > "A pointer to a CREATESTRUCT structure that contains initialization parameters for the window *about to be created*."
    >
    > and
    >
    > "It is possible to send messages to *the newly created window*,"
    >
    > So, we have a Schr0|A0+<ger's cat window which, in the HCBT_CREATEWND event, is at the same time both created and not created ?
    >
    > Yeah, that really gives me confidence about the correctness/completeness of the rest of the provided information. :-(
    >
    > Also,
    >
    > "Specifies a long pointer to a CBT_CREATEWND structure containing *initialization parameters for the window*."
    >
    > Which is *at best* vague. At face value it seems to indicate that *all* of those parameters are used.
    >
    > Which is followed by
    >
    > "The parameters *include* the coordinates and dimensions of the window."
    >
    > And pardon me, but as far as I know "include" doesn't translate to "with the exclusion of everything else" in *any* language.
    >
    > Again : do you perhaps have any information, MSDN or otherwise, about which fields in that CREATESTRUCT may be changed (are carried on to the resulting window) ?
    >
    > If that information is not available then I will just determine them empirically.
    >
    > And by the way: I just tried to change the Style and ExStyle fields, but those also get ignored.
    >
    > Regards,
    > Rudy Wieser

    It's a documentation problem, that I agree. One of many. Probably because it
    was written by multiple people, instead of one.

    Humans make mistakes, but code doesn't. The truth lies in the actual test
    result itself.
  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Sun Nov 16 11:21:09 2025
    From Newsgroup: alt.windows7.general

    JJ,

    > It's a documentation problem, that I agree. One of many.

    Indeed.

    And because I could not be certain about what was actually said/meant I
    thought it to be prudent to ask.

    I could also easily imagine that I missed something on the found webpages themselves, or that I just didn't find the webpages containing the
    explanation I was looking for.

    > The truth lies in the actual test result itself.

    I've also been told that I should not depend on empirical results, as they
    are not official and thus could change at any moment. Which is the other reason I asked.

    Thanks for the help. If nothing else, it pushed me to do some more "what happens when I change/do /this/" probing. :-)

    Regards,
    Rudy Wieser


  • From J. P. Gilliver@G6JPG@255soft.uk to comp.os.ms-windows.programmer.win32,alt.windows7.general on Sun Nov 16 15:27:26 2025
    From Newsgroup: alt.windows7.general

    On 2025/11/16 7:16:21, JJ wrote:
    > On Sat, 15 Nov 2025 08:50:49 +0100, R.Wieser wrote:
    > > So, we have a SchrN++N++N++N++N++N++ger's cat window which, in the HCBT_CREATEWND event,
    []
    In RW's preceding post, I saw that as "Schrödinger's" - with an o
    umlaut, and the rest ordinary ASCII characters.
    In JJ's post, and in my compose window as I'm typing this, I see Schr######ger's, except instead of #, a vertical diamond with a light
    question mark on it.
    I'm used to characters falling foul of encodings and showing up as odd characters; I'm also used to characters sometimes turning up as two or
    more characters.
    What's puzzling me is why (a) the "din" characters got converted, (b)
    why there are _six_ of the substitute character. I _suppose_ the latter
    could be the ö (o umlaut) being converted into _three_ characters
    followed by the other characters, but still ...
    (Using here Thunderbird "140.4.0esr (64-bit)" under windows 10-64.)
    --
    J. P. Gilliver. UMRA: 1960/<1985 MB++G()ALIS-Ch++(p)Ar++T+H+Sh0!:`)DNAf
  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Sun Nov 16 21:51:42 2025
    From Newsgroup: alt.windows7.general

    J. P. ,

    > What's puzzling me is why (a) the "din" characters got converted,

    That's easy : the o-umlaut character got recognised as the start of a multi-byte sequence.

    Of course, the sequence-conversion /should/ have aborted as the next
    character is outside the range of permitted secondary or ending multi-byte characters.

    > (b) why there are _six_ of the substitute character.

    Where JJ quoted me I see two "I have no idea what this multi-byte character is" rectangles, each consisting of three bytes.

    The first question is how a single multi-byte leading byte has caused *two* multi-byte character sequences to be emitted. That should never have happened.

    The next question is why the two multi-byte character sequences are, in your editor, not displayed as two unknown multi-byte characters, but as six separate bytes, all unknown.

    My guess is that it is your own editor which tries to be smart: it tried
    to interpret the message as ASCII and noticed that there are a number of bytes
    in the message that fall outside of that range. Or it used a font with only
    the ASCII range defined, of course.

    Regards,
    Rudy Wieser
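
The point about the o-umlaut byte looking like the start of a multi-byte sequence can be illustrated with a small Python sketch. It assumes the post was encoded as Windows-1252 (where the o-umlaut is the single byte 0xF6); the helper `utf8_role` is a made-up name that classifies a byte by its bit pattern only, and Python's own decoder stands in for a newsreader's.

```python
def utf8_role(byte):
    """Classify a byte by its UTF-8 bit pattern alone (illustrative helper)."""
    if byte < 0x80:
        return "ascii"          # 0xxxxxxx
    if byte < 0xC0:
        return "continuation"   # 10xxxxxx
    if byte < 0xE0:
        return "lead-2"         # 110xxxxx
    if byte < 0xF0:
        return "lead-3"         # 1110xxxx
    if byte < 0xF8:
        return "lead-4"         # 11110xxx (0xF5..0xF7 fit the pattern but are never valid UTF-8)
    return "invalid"


raw = "Schr\u00f6dinger".encode("cp1252")   # the o-umlaut becomes the single byte 0xF6

roles = [utf8_role(b) for b in raw[4:6]]    # 0xF6 followed by 'd'
as_utf8 = raw.decode("utf-8", errors="replace")
as_ansi = raw.decode("cp1252")

print(roles)          # ['lead-4', 'ascii']
print(ascii(as_utf8)) # 'Schr\ufffddinger'
print(as_ansi)
```

A decoder that stops at the first non-continuation byte, as Python's does here, replaces only the 0xF6 byte and leaves "dinger" intact, which matches the "should have aborted" expectation in the post.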


  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Mon Nov 17 14:14:33 2025
    From Newsgroup: alt.windows7.general

    On Sun, 16 Nov 2025 21:51:42 +0100, R.Wieser wrote:

    > The first question is how a single multi-byte leading byte has caused *two* multi-byte character sequences to be emitted. That should never have happened.

    I think that would be due to a bug in MSOE. Can you post the same word again along with different words with the o-umlaut character from your MSOE?
  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Mon Nov 17 10:01:46 2025
    From Newsgroup: alt.windows7.general

    JJ,

    > > The first question is how a single multi-byte leading byte has
    > > caused *two* multi-byte character sequences to be emitted.
    > > That should never have happened.
    >
    > I think that would be due to a bug in MSOE.

    You're suspecting my newsgroup reader ? I already checked the message I posted and the one in the newsgroup, and those just have a single 0xF6 character in them.

    Can you check at your end what my message looks like (save message to disk, look at it with a hex editor) ? If it also contains a single 0xF6
    char there ...

    > Can you post the same word again along with different words
    > with the o-umlaut character from your MSOE?

    Your wish is my command :-)

    textöabcdef
    textöbcdefg
    textöcdefgh
    textödefghi
    textöefghij
    textöfghijk
    textöghijkl

    Remark: I just saved the to-be-sent message to disk and looked at it with a hex editor, and all I see is one 0xF6 char per above line. iow, no pre-post conversions that I can see.

    Regards,
    Rudy Wieser



  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Tue Nov 18 21:18:49 2025
    From Newsgroup: alt.windows7.general

    On Mon, 17 Nov 2025 10:01:46 +0100, R.Wieser wrote:

    > You're suspecting my newsgroup reader ? I already checked the message I posted and the one in the newsgroup, and those just have a single 0xF6 character in them.
    >
    > Can you check at your end what my message looks like (save message to disk, look at it with a hex editor) ? If it also contains a single 0xF6 char there ...
    >
    > > Can you post the same word again along with different words
    > > with the o-umlaut character from your MSOE?
    >
    > Your wish is my command :-)
    >
    > text0|e0#udef
    > text0|a0#nefg
    > text0|e0|Nfgh
    > text0|A0|aghi
    > text0|A0|ohij
    > text0|u0|?ijk
    > text0|U0+-jkl
    >
    > Remark: I just saved the to-be-sent message to disk and looked at it with a hex editor, and all I see is one 0xF6 char per above line. iow, no pre-post conversions that I can see.
    >
    > Regards,
    > Rudy Wieser

    You're right! It's not MSOE's fault. It's my fault!

    My 40tude Dialog default character set was set to UTF-8. IIRC, the default was `Default`, which is presumably the system's ANSI locale. I chose to set it to
    UTF-8 because I was seeing more and more UTF-8 encoded messages.

    MSOE doesn't emit a `Content-Type` header and only uses the system's ANSI locale.
    That causes the character corruption. And I quoted the message after it had
    already been misinterpreted. My bad. I'll remember this problem next time.
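
The effect of a missing `Content-Type` header can be sketched with Python's standard `email` package. This is only an illustration of the mechanism described above: the sample articles are made up, the helper name `decode_body` is hypothetical, and `cp1252` stands in for "the system's ANSI locale" on a Western European Windows installation.

```python
from email import message_from_bytes


def decode_body(raw, fallback="cp1252"):
    """Decode a message body, using `fallback` when no charset is declared."""
    msg = message_from_bytes(raw)
    charset = msg.get_content_charset() or fallback
    return msg.get_payload(decode=True).decode(charset, errors="replace")


# Made-up article without a Content-Type header; the body holds the byte 0xF6.
undeclared = b"Subject: test\r\n\r\ntext\xf6abcdef"
# Made-up article that declares UTF-8 and really is UTF-8 encoded.
declared = b"Content-Type: text/plain; charset=UTF-8\r\n\r\ntext\xc3\xb6abcdef"

ansi_guess = decode_body(undeclared, fallback="cp1252")   # reader defaults to ANSI
utf8_guess = decode_body(undeclared, fallback="utf-8")    # reader defaults to UTF-8
declared_text = decode_body(declared)

print(ascii(ansi_guess))     # 'text\xf6abcdef'
print(ascii(utf8_guess))     # 'text\ufffdabcdef'
print(ascii(declared_text))  # 'text\xf6abcdef'
```

With no declared charset the result depends entirely on the reader's default, which is why the same article looked fine in one newsreader and corrupted in another.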
  • From J. P. Gilliver@G6JPG@255soft.uk to comp.os.ms-windows.programmer.win32,alt.windows7.general on Tue Nov 18 14:43:57 2025
    From Newsgroup: alt.windows7.general

    On 2025/11/18 14:18:49, JJ wrote:
    > On Mon, 17 Nov 2025 10:01:46 +0100, R.Wieser wrote:
    [attribution line had been snipped]
    > > > Can you post the same word again along with different words
    > > > with the o-umlaut character from your MSOE?
    > >
    > > Your wish is my command :-)
    > >
    > > textN++N++N++N++N++N++def
    > > textN++N++N++N++N++N++efg
    > > textN++N++N++N++N++N++fgh
    > > textN++N++N++N++N++N++ghi
    > > textN++N++N++N++N++N++hij
    > > textN++N++N++N++N++N++ijk
    > > textN++N++N++N++N++N++jkl
    If it helps: I am using Thunderbird. When I viewed the above block in
    RW's post (2025/11/17, 9:1:46 my time), it appeared as seven "words",
    starting with "text" and an o umlaut, then six more letters, the first
    "word" being textöabcdef. When I viewed it as quoted in JJ's post
    (14:18:49 my time), and in my compose editor as I am typing now, each
    "word" starts with text, then six identical characters, then three
    letters (def for the first one). The identical characters are here a
    vertical diamond with a background coloured question mark inside them.
    []
    > You're right! It's not MSOE's fault. It's my fault!
    []
    --
    J. P. Gilliver. UMRA: 1960/<1985 MB++G()ALIS-Ch++(p)Ar++T+H+Sh0!:`)DNAf
  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Tue Nov 18 15:44:03 2025
    From Newsgroup: alt.windows7.general

    JJ,

    > You're right! It's not MSOE's fault. It's my fault!

    You're taking the blame for what one of your programs did and not throwing them under the bus ? As their manager that's big of you. :-)

    > My 40tude Dialog default character set was set to UTF-8.
    > ...
    > MSOE doesn't emit a `Content-Type` header

    After my previous post I thought of checking just that, and noticed the
    same. I have absolutely no idea what the default encoding of MSOE in its "Plain text" mode is

    > and only uses the system's ANSI locale.

    That sounds logical.

    > And I quoted the message after it had already been misinterpreted

    That was what I assumed happened. Now there is only the question why J.P. sees every-over-0x7F(?) char displayed separately and as unrecognised.

    Thanks for the feedback.

    Regards,
    Rudy Wieser


  • From J. P. Gilliver@G6JPG@255soft.uk to comp.os.ms-windows.programmer.win32,alt.windows7.general on Wed Nov 19 00:33:55 2025
    From Newsgroup: alt.windows7.general

    On 2025/11/18 14:44:3, R.Wieser wrote:
    > JJ,
    >
    > > You're right! It's not MSOE's fault. It's my fault!
    >
    > You're taking the blame for what one of your programs did and not throwing them under the bus ? As their manager that's big of you. :-)
    >
    > > My 40tude Dialog default character set was set to UTF-8.
    > > ...
    > > MSOE doesn't emit a `Content-Type` header
    >
    > After my previous post I thought of checking just that, and noticed the same. I have absolutely no idea what the default encoding of MSOE in its "Plain text" mode is
    >
    > > and only uses the system's ANSI locale.
    >
    > That sounds logical.
    >
    > > And I quoted the message after it had already been misinterpreted
    >
    > That was what I assumed happened. Now there is only the question why J.P. sees every-over-0x7F(?) char displayed separately and as unrecognised.
    []
    Well, I haven't looked at what's there in raw, but I often see
    characters like the ö exactly as intended.
    --
    J. P. Gilliver. UMRA: 1960/<1985 MB++G()ALIS-Ch++(p)Ar++T+H+Sh0!:`)DNAf
  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Wed Nov 19 08:10:00 2025
    From Newsgroup: alt.windows7.general

    J.P. ,

    > > That was what I assumed happened. Now there is only the question
    > > why J.P. sees every-over-0x7F(?) char displayed separately and as
    > > unrecognised.
    >
    > Well, I haven't looked at what's there in raw, but I often see
    > characters like the ö exactly as intended.

    It might have something to do with the fact that my posts don't include a character encoding (in the headers), while JJ's advertise theirs as being UTF-8.

    Regards,
    Rudy Wieser


  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Wed Nov 19 14:25:28 2025
    From Newsgroup: alt.windows7.general

    On Mon, 17 Nov 2025 10:01:46 +0100, R.Wieser wrote:

    > Your wish is my command :-)
    >
    > text0|e0#udef
    > text0|a0#nefg
    > text0|e0|Nfgh
    > text0|A0|aghi
    > text0|A0|ohij
    > text0|u0|?ijk
    > text0|U0+-jkl
    >
    > Remark: I just saved the to-be-sent message to disk and looked at it with a hex editor, and all I see is one 0xF6 char per above line. iow, no pre-post conversions that I can see.
    >
    > Regards,
    > Rudy Wieser

    This got me wondering...

    Does MSOE only emit the `Content-Type` header if there's actually a Unicode character outside of the system's ANSI character set in the message body?

    What will happen if the message includes Unicode characters outside of the
    system's ANSI character set? e.g. the U+2602 Umbrella character (☂) from the `Symbols & Dingbats` Unicode subrange copied from Character Map.

    I suppose the Umbrella character above is shown properly in MSOE?
  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Wed Nov 19 14:25:31 2025
    From Newsgroup: alt.windows7.general

    On Tue, 18 Nov 2025 14:43:57 +0000, J. P. Gilliver wrote:

    > If it helps: I am using Thunderbird. When I viewed the above block in
    > RW's post (2025/11/17, 9:1:46 my time), it appeared as seven "words", starting with "text" and an o umlaut, then six more letters, the first
    > "word" being textöabcdef. When I viewed it as quoted in JJ's post
    > (14:18:49 my time), and in my compose editor as I am typing now, each
    > "word" starts with text, then six identical characters, then three
    > letters (def for the first one). The identical characters are here a
    > vertical diamond with a background coloured question mark inside them.

    I still keep my 40tude set to UTF-8 by default, so my messages still contain corrupted characters.

    The characters corrupted by 40tude end up as two Low Surrogates: 2 Unicode
    characters, 6 raw UTF-8 bytes.

    AFAIK, Surrogates only come as High & Low Surrogate pairs. So Low & Low
    Surrogates is an invalid surrogate sequence.

    I think Thunderbird sees the 6 raw UTF-8 bytes as invalid (after
    decoding), and displays them as 6 Replacement Character Unicode characters (U+FFFD).
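
The "two Low Surrogates, six bytes, six replacement characters" reasoning can be reproduced in a few lines of Python. The two code points below are arbitrary example values, and Python's strict UTF-8 decoder is used as a stand-in for Thunderbird's, which is an assumption; `surrogatepass` is only used to force the lone surrogates into bytes the way a lenient encoder might.

```python
pair = "\udd42\udc63"   # two lone low surrogates, chosen only as an example


def is_low_surrogate(ch):
    """True for code points in the low-surrogate range U+DC00..U+DFFF."""
    return 0xDC00 <= ord(ch) <= 0xDFFF


# A strict encoder refuses lone surrogates outright.
try:
    pair.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# A lenient encoder writes three bytes per surrogate: six bytes in total.
raw = pair.encode("utf-8", errors="surrogatepass")

# A strict decoder rejects each of those bytes individually.
shown = raw.decode("utf-8", errors="replace")

print(all(is_low_surrogate(c) for c in pair))  # True
print(strict_ok)                               # False
print(raw.hex())                               # edb582edb1a3
print(len(shown), ascii(shown[0]))             # 6 '\ufffd'
```

Each surrogate's three bytes are invalid on their own terms (0xED may not be followed by 0xB0..0xBF, and the remaining bytes are stray continuation bytes), so a strict decoder emits one U+FFFD per byte rather than one per sequence.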
  • From J. P. Gilliver@G6JPG@255soft.uk to comp.os.ms-windows.programmer.win32,alt.windows7.general on Wed Nov 19 07:35:14 2025
    From Newsgroup: alt.windows7.general

    On 2025/11/19 7:25:28, JJ wrote:
    > On Mon, 17 Nov 2025 10:01:46 +0100, R.Wieser wrote:
    >
    > > Your wish is my command :-)
    > >
    > > textN++N++N++N++N++N++def
    > > textN++N++N++N++N++N++efg
    > > textN++N++N++N++N++N++fgh
    > > textN++N++N++N++N++N++ghi
    > > textN++N++N++N++N++N++hij
    > > textN++N++N++N++N++N++ijk
    > > textN++N++N++N++N++N++jkl

    I've just done "view raw" on that as received, and this is what it shows
    - of course, I don't know if _this_ will get corrupted!:

    text|!-|rCU|!-#-udef
    text|!-|rCa|!-#-nefg
    text|!-|+a|!-|-Nfgh
    text|!-|++|!-|-aghi
    text|!-|rCO|!-|-ohij
    text|!-|rCo|!-|-?ijk
    text|!-|+i|!-+--jkl

    (In case it _does_ get corrupted: the first one looks to me like "text",
    i grave, mu [micro], what looks like a comma, i grave again, plusminus,
    pound, then "def".)

    > > Remark: I just saved the to-be-sent message to disk and looked at it with a
    > > hex editor, and all I see is one 0xF6 char per above line. iow, no pre-post
    > > conversions that I can see.
    > >
    > > Regards,
    > > Rudy Wieser
    >
    > This got me wondering...
    >
    > Does MSOE only emit the `Content-Type` header if there's actually a Unicode character outside of the system's ANSI character set in the message body?
    >
    > What will happen if the message includes Unicode characters outside of the system's ANSI character set? e.g. the U+2602 Umbrella character (☂) from the `Symbols & Dingbats` Unicode subrange copied from Character Map.
    >
    > I suppose the Umbrella character above is shown properly in MSOE?
    It shows correctly as I view your post and in this reply; in raw, it
    shows as (|o-LrCU), that is a circumflex, tilde, what looks like a comma.
    --
    J. P. Gilliver. UMRA: 1960/<1985 MB++G()ALIS-Ch++(p)Ar++T+H+Sh0!:`)DNAf

    User Error: Replace user, hit any key to continue.
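
The raw view of the umbrella character described above ("a circumflex, tilde, what looks like a comma") is what one would expect if the three UTF-8 bytes of U+2602 are displayed as Windows-1252. A short Python check, assuming that is the code page the raw view used:

```python
import unicodedata

umbrella = "\u2602"                 # UMBRELLA
raw = umbrella.encode("utf-8")      # three bytes: E2 98 82
misread = raw.decode("cp1252")      # each byte shown as its Windows-1252 character

names = [unicodedata.name(c) for c in misread]

print(raw.hex())   # e29882
for name in names:
    print(name)
# LATIN SMALL LETTER A WITH CIRCUMFLEX
# SMALL TILDE
# SINGLE LOW-9 QUOTATION MARK
```

The "comma" is really U+201A, the single low-9 quotation mark that Windows-1252 assigns to byte 0x82.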
  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Wed Nov 19 14:37:08 2025
    From Newsgroup: alt.windows7.general

    On Wed, 19 Nov 2025 08:10:00 +0100, R.Wieser wrote:
    > J.P. ,
    >
    > > > That was what I assumed happened. Now there is only the question
    > > > why J.P. sees every-over-0x7F(?) char displayed separately and as
    > > > unrecognised.
    > >
    > > Well, I haven't looked at what's there in raw, but I often see
    > > characters like the ö exactly as intended.
    >
    > It might have something to do with the fact that my posts don't include a character encoding (in the headers), while JJ's advertise theirs as being UTF-8.
    >
    > Regards,
    > Rudy Wieser

    I've changed my 40tude's default character set to `Default`, which should be autodetect. The quoted o-umlaut should be copied properly. AFAIK, Usenet was initially for OEM/ANSI character sets anyway.
  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Wed Nov 19 14:46:23 2025
    From Newsgroup: alt.windows7.general

    On Wed, 19 Nov 2025 07:35:14 +0000, J. P. Gilliver wrote:

    > I've just done "view raw" on that as received, and this is what it shows
    > - of course, I don't know if _this_ will get corrupted!:
    >
    > text|!-|rCU|!-#-udef
    > text|!-|rCa|!-#-nefg
    > text|!-|+a|!-|-Nfgh
    > text|!-|++|!-|-aghi
    > text|!-|rCO|!-|-ohij
    > text|!-|rCo|!-|-?ijk
    > text|!-|+i|!-+--jkl
    >
    > (In case it _does_ get corrupted: the first one looks to me like "text",
    > i grave, mu [micro], what looks like a comma, i grave again, plusminus, pound, then "def".)

    Well, the raw view displays the raw UTF-8 encoded message. It's not yet
    decoded to Unicode code points.
  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Wed Nov 19 11:22:13 2025
    From Newsgroup: alt.windows7.general

    JJ,

    > Does MSOE only emit the `Content-Type` header if there's actually a
    > Unicode character outside of the system's ANSI character set in the message
    > body?

    I have no idea. And it will be hard to test it, as I can only paste ANSI
    into my posts.

    > e.g. the U+2602 Umbrella character (?)
    > ...
    > I suppose the Umbrella character above is shown properly in MSOE?

    All I see there is a single "unknown character" rectangle (which /does/
    mean that MSOE recognises it as a UTF-8 sequence. Odd).

    I wanted to suggest that that is perhaps because my fonts are old, but checking that character (as "&#x2602;") in a webpage in Firefox does show
    it.

    Regards,
    Rudy Wieser


  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Wed Nov 19 12:28:34 2025
    From Newsgroup: alt.windows7.general

    JJ,

    > The characters corrupted by 40tude end up as two Low Surrogates:
    > 2 Unicode characters, 6 raw UTF-8 bytes.

    That part is clear. No idea why your newsgroup reader didn't just emit a single 0xFFFD sequence though.

    The question is why J.P.'s newsgroup reader

    1) doesn't recognise them as two valid UTF-8 sequences (and show a single rectangle for each of them)

    2) doesn't display the bytes as ANSI characters.

    > I think Thunderbird sees the 6 raw UTF-8 bytes as invalid
    > (after decoding)

    As undisplayable ? That's quite possible.

    Still, why does J.P.'s newsgroup reader not show a single "unknown
    character" rectangle per sequence (as my MSOE does) ?

    The only thing I can currently think of is that showing "unrecognised" place-holders for all bytes is some kind of (rather useless) byte-counting debugging aid (which also makes it harder to guess what the word should have been).

    By the way: Your newsgroup reader seemingly emits a UTF-8 sequence that indicates an unknown UTF-8 sequence has been encountered, but that sequence
    is in turn not recognised by either J.P.'s or my newsgroup reader. :-( :-)

    Regards,
    Rudy Wieser


  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Wed Nov 19 12:45:54 2025
    From Newsgroup: alt.windows7.general

    J.P.,

    > I've just done "view raw" on that as received, and this is what it
    > shows - of course, I don't know if _this_ will get corrupted!:
    > ...
    > (In case it _does_ get corrupted: the first one looks to me like "text",
    > i grave, mu [micro], what looks like a comma, i grave again, plusminus, pound, then "def".)

    This is a hex-dump of it (between the "text" and the "def" parts) :

    C3 AD C2 B5 E2 80 9A C3 AD C2 B1 C2 A3

    We (me too) see 6 chars, while the above shows there should be 13 of them. With only the "E2 80" looking like it could be a UTF-8 sequence. Curious.

    Regards,
    Rudy Wieser
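
One possible reading of the 13-byte dump, offered as a sketch rather than a confirmed diagnosis: the bytes decode cleanly as UTF-8 into six characters, and those six characters are what six other bytes look like when displayed as Windows-1252. The assumption here is that the raw view was rendered as Windows-1252 and then re-encoded as UTF-8 when it was posted.

```python
dump = bytes.fromhex("C3 AD C2 B5 E2 80 9A C3 AD C2 B1 C2 A3")

# Step 1: the 13 bytes form six valid UTF-8 sequences (five 2-byte, one 3-byte).
chars = dump.decode("utf-8")

# Step 2: undo the assumed Windows-1252 display step to recover the inner bytes.
inner = chars.encode("cp1252")

# Step 3: read the inner bytes as UTF-8 while tolerating surrogate code points.
code_points = [hex(ord(c)) for c in inner.decode("utf-8", errors="surrogatepass")]

print(len(dump), len(chars))   # 13 6
print(inner.hex())             # edb582edb1a3
print(code_points)             # ['0xdd42', '0xdc63']
```

Under that assumption the six visible characters are the six raw bytes, and the two resulting code points both fall in the low-surrogate range U+DC00..U+DFFF.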


  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Wed Nov 19 13:08:29 2025
    From Newsgroup: alt.windows7.general

    JJ,

    The quoted o-umlaut should be copied properly.

    (a bit above it)

    but I often see characters like the % exactly as intended.

    Yep, I now see the o-umlaut in all its splendid glory. :-)

    AFAIK, Usenet is initially for OEM/ANSI character set anyway.

    Nope. Initially just 7-bit bytes, /mostly/ ASCII (EBCDIC anyone ?).

    But, over the years, has been expanded to allow for several character encodings as well as being able to include attachments.

    Heck, you can now even post HTML-ed messages - though, luckily, my newsgroup host (Eternal September) rejects them (at least for newsgroups like this one).

    Regards,
    Rudy Wieser


  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Thu Nov 20 14:54:30 2025
    From Newsgroup: alt.windows7.general

    On Wed, 19 Nov 2025 11:22:13 +0100, R.Wieser wrote:

    JJ,

    Does MSOE only emit the `Content-Type` header if there's actually a
    Unicode character outside of system's ANSI character set in the message
    body?

    I have no idea. And it will be hard to test it, as I can only paste ANSI into my posts.

    e.g. the U+2602 Umbrella character (?)
    ....
    I suppose the Umbrella character above is shown properly in MSOE?

    All I see there is a single "unknown character" rectangle (which /does/ mean that MSOE recognises it as a UTF-8 sequence. Odd).

    I wanted to suggest that that is perhaps because my fonts are old, but checking that character (as "&#x2602;") in a webpage in Firefox does show it.

    Regards,
    Rudy Wieser

    Seems like MSOE's Unicode support is for HTML content only.
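
    As a rough illustration of the umbrella test discussed above: U+2602 takes three bytes in UTF-8, has no slot in a single-byte "ANSI" code page, and is the same code point that the HTML reference "&#x2602;" names. Windows-1252 is assumed as the ANSI code page here; the variable names are illustrative.

```python
import html

umbrella = "\u2602"  # U+2602 UMBRELLA

# As UTF-8 the character takes three bytes.
utf8_bytes = umbrella.encode("utf-8")
print(" ".join(f"{b:02X}" for b in utf8_bytes))

# Windows-1252 (assumed here as the "ANSI" code page) cannot encode it.
try:
    umbrella.encode("cp1252")
    fits_in_ansi = True
except UnicodeEncodeError:
    fits_in_ansi = False
print(fits_in_ansi)

# The HTML numeric character reference names the same code point.
from_html = html.unescape("&#x2602;")
print(from_html == umbrella)
```

    So a reader that shows one placeholder for it has at least grouped the three bytes into a single sequence, even if it then cannot display the character.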
  • From J. P. Gilliver@G6JPG@255soft.uk to comp.os.ms-windows.programmer.win32,alt.windows7.general on Sat Nov 22 01:29:55 2025
    From Newsgroup: alt.windows7.general

    On 2025/11/20 9:16:55, R.Wieser wrote:
    JJ,

    Seems like MSOE's Unicode support is for HTML content only.

    That would not at all surprise me, seeing that the mode I'm composing it is called "plain text". You're not getting any plainer than ASCII. :-)

    But I just remembered that in Windows you can enter special chars using ALT 0{number} on the numeric keypad. Testing ... Nope, doesn't work either.

    Regards,
    Rudy Wieser


    You have to have Num Lock on - or did in earlier versions of Windows;
    I've just tried (in Windows 10) and it works whether you do or don't. (I
    don't know when that changed.)

    There are also two (maybe more?) sets of code sequences; I remember the
    code for mu - the micro sign, µ - as Alt-230; the ones CharMap tell you
    about have a leading 0, e. g. for µ Alt-0181.
    (Trying them adjacent, in case they're actually different: µµ. Those
    look identical to me in this compose window.)
    --
    J. P. Gilliver. UMRA: 1960/<1985 MB++G()ALIS-Ch++(p)Ar++T+H+Sh0!:`)DNAf

    Our thorny national debate about Brexit could turn out to be irrelevant.
    Sooner or later the EU as we know it may no longer be there for us to
    leave. - Katya Adler, BBC Europe editor (RT, 2017/2/4-10)
  • From R.Wieser@address@is.invalid to comp.os.ms-windows.programmer.win32,alt.windows7.general on Sat Nov 22 07:40:52 2025
    From Newsgroup: alt.windows7.general

    J.P. ,

    But I just remembered that in Windows you can enter special chars using
    ALT 0{number} on the numeric keypad. Testing ... Nope, doesn't work
    either.
    ...
    You have to have Num Lock on - or did in earlier versions of Windows;

    In my version of Windows, off. Otherwise I get the cursor keys mode.

    I've just tried (in Windows 10) and it works whether you do or don't.
    (I don't know when that changed.)

    I'm working with a full, external keyboard here. It might differ for
    built-in laptop keyboards.

    There are also two (maybe more?) sets of code sequences; I remember
    the code for mu - the micro sign, µ - as Alt-230; the ones CharMap tell
    you about have a leading 0, e. g. for µ Alt-0181.

    The one without the leading zero is for ANSI (local) chars, while the one
    with the "0" prefix is for Unicode chars.

    Regards,
    Rudy Wieser


  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Sat Nov 22 13:48:10 2025
    From Newsgroup: alt.windows7.general

    On Sat, 22 Nov 2025 01:29:55 +0000, J. P. Gilliver wrote:
    You have to have Num Lock on - or did in earlier versions of Windows;
    I've just tried (in Windows 10) and it works whether you do or don't. (I don't know when that changed.)

    There are also two (maybe more?) sets of code sequences; I remember the
    code for mu - the micro sign, µ - as Alt-230; the ones CharMap tell you
    about have a leading 0, e. g. for µ Alt-0181.
    (Trying them adjacent, in case they're actually different: µµ. Those
    look identical to me in this compose window.)

    NumLock doesn't have to be on.

    Alt+### generates an OEM character from a decimal character code. Supported by
    the BIOS. In Windows, it will generate the wanted OEM character, but with
    a different character code, because Windows is ANSI/Unicode native.

    Alt+0### generates an ANSI character from a decimal character code. Supported
    since Windows 2.0.

    Alt+NumpadPlus+xxxxx generates a Unicode character from a hexadecimal Unicode
    code point (1-5 hex digits). Supported since Windows 7. Digits can be
    entered from both the numpad and the main key section, e.g. Alt+NumpadPlus+20ac
    generates the Euro symbol `€` (U+20AC). AutoHotkey v1.0 can be used to
    implement this function in Windows XP. Note: later AutoHotkey versions
    (v1.1+) no longer run in Windows XP (sic).

    And in WordPad and MS Word, pressing Alt+X after one character, or after a
    Unicode code point of up to 5 hex digits (a text selection will specify/restrict
    the length), converts between the two forms. Unfortunately, it's not built-in
    support from the RichEdit UI control.
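
    The three Alt-code variants described above can be sketched as plain code-page lookups. This is only an illustration: code page 437 is assumed as the OEM page and Windows-1252 as the ANSI page (both depend on the system locale), and the variable names are made up.

```python
# Alt+230 (no leading zero): byte 230 looked up in an OEM code page.
# Code page 437 is assumed here; the actual OEM page depends on the locale.
oem_char = bytes([230]).decode("cp437")

# Alt+0181 (leading zero): byte 181 looked up in the ANSI code page.
# Windows-1252 is assumed here.
ansi_char = bytes([181]).decode("cp1252")

# Alt+NumpadPlus+20ac: a hexadecimal Unicode code point.
euro = chr(int("20ac", 16))

for label, ch in [("Alt+230", oem_char),
                  ("Alt+0181", ansi_char),
                  ("Alt+NumpadPlus+20ac", euro)]:
    print(f"{label}: {ch!r} U+{ord(ch):04X}")

# Both Alt codes land on the same character, the micro sign U+00B5.
same_micro = oem_char == ansi_char == "\u00b5"
print(same_micro)
```

    Under those assumptions Alt+230 and Alt+0181 give the same micro sign, just reached through different code pages.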
  • From JJ@jj4public@gmail.com to comp.os.ms-windows.programmer.win32,alt.windows7.general on Sat Nov 22 13:50:11 2025
    From Newsgroup: alt.windows7.general

    On Sat, 22 Nov 2025 13:48:10 +0700, JJ wrote:

    Alt+### generate OEM character from decimal character code. Supported by BIOS.

    Of course, Windows handles everything and no longer relies on the BIOS for this.
  • From J. P. Gilliver@G6JPG@255soft.uk to comp.os.ms-windows.programmer.win32,alt.windows7.general on Sat Nov 22 11:19:58 2025
    From Newsgroup: alt.windows7.general

    On 2025/11/22 6:40:52, R.Wieser wrote:
    J.P. ,

    But I just remembered that in Windows you can enter special chars using
    ALT 0{number} on the numeric keypad. Testing ... Nope, doesn't work
    either.
    ...
    You have to have Num Lock on - or did in earlier versions of Windows;

    In my version of Windows, off. Otherwise I get the cursor keys mode.

    I've just tried (in Windows 10) and it works whether you do or don't.
    (I don't know when that changed.)

    I'm working with a full, external keyboard here. It might differ for built-in laptop keyboards.

    I'm using a full external one here too, µµ and it works on or off (I
    just verified it), but I definitely remember when I first learnt about
    it (might have been in Windows 3.x days) it definitely had to be in one
    state (I think on).

    There are also two (maybe more?) sets of code sequences; I remember
    the code for mu - the micro sign, µ - as Alt-230; the ones CharMap tell
    you about have a leading 0, e. g. for µ Alt-0181.

    The one without the leading zero is for ANSI (local) chars, while the one with the "0" prefix is for Unicode chars.

    Yes. 230 is the only one I remember for the ANSI set, probably because I
    worked in electronics and it was used a lot as the micro- prefix.

    Regards,
    Rudy Wieser


    --
    J. P. Gilliver. UMRA: 1960/<1985 MB++G()ALIS-Ch++(p)Ar++T+H+Sh0!:`)DNAf