• Stripping trailing blanks from Pascal type strings

    From Robert Prins@robert@prino.org to alt.lang.asm on Sun Aug 17 21:15:20 2025
    From Newsgroup: alt.lang.asm

    The maximum length of the string is 63 characters, and on average there are 44 trailing blanks, and right now I have some 25,740 strings to process. There's the obvious, from the existing code:

        mov   ecx, il_dloc + 1

    @21:
        dec   ecx
        cmp   byte ptr [ebx + ecx + offset lift_list.deploc], " "
        je    @21

        mov   byte ptr [ebx + offset lift_list.deploc], cl

    or, in pure Pascal:

    _i:= length(lift_ptr^.deploc) + 1;
    repeat
      dec(_i);
    until lift_ptr^.deploc[_i] <> ' ';
    lift_ptr^.deploc[0]:= char(_i);

    Obviously I could just look for four trailing blanks (EAX), add 3 to ecx on non-blank, look for two trailing blanks (AX), and then for one (AL), but is there anything cleverer? And FWIW, this is 32-bit code, so no RAX.
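
A minimal sketch of the four/two/one idea described above, written in C rather than inline assembler so that it can be compiled anywhere. It is an illustration under stated assumptions, not the code from the program: `DLOC_MAX` stands in for `il_dloc` and is assumed to be 63, the string is assumed to use the Pascal short-string layout (length byte at index 0, characters at 1..63), and `make_padded` and `trimmed_length` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { DLOC_MAX = 63 };   /* assumed value of il_dloc (maximum string length) */

/* Hypothetical helper: build a Pascal short string, s[0] = length byte,
   s[1..63] = text padded with blanks. Only here to make the sketch testable. */
static void make_padded(unsigned char s[DLOC_MAX + 1], const char *text)
{
    size_t len = strlen(text);
    assert(len <= DLOC_MAX);
    memset(s, ' ', DLOC_MAX + 1);
    memcpy(s + 1, text, len);
    s[0] = DLOC_MAX;               /* untrimmed length, as set before the scan */
}

/* Return the length of s without its trailing blanks. */
static unsigned trimmed_length(const unsigned char *s)
{
    unsigned n = DLOC_MAX;         /* index of the last character still to check */
    uint32_t four;
    uint16_t two;

    /* Step back four characters at a time while s[n-3..n] are all blanks. */
    while (n >= 4) {
        memcpy(&four, s + n - 3, 4);
        if (four != 0x20202020u)
            break;
        n -= 4;
    }
    /* At most three trailing blanks are left: try two, then one. */
    if (n >= 2) {
        memcpy(&two, s + n - 1, 2);
        if (two == 0x2020u)
            n -= 2;
    }
    if (n >= 1 && s[n] == ' ')
        n -= 1;
    return n;
}
```

Because all four bytes of the comparison value are the same, the 32-bit and 16-bit compares do not depend on byte order. The `n >= 4`, `n >= 2` and `n >= 1` guards keep the scan from ever reading the length byte at index 0, so an all-blank string yields 0.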

    And no, it's not going to noticeably affect the running time of the program, I'm
    just curious.

    Thanks,

    Robert
    --
    Robert AH Prins
    robert(a)prino(d)org
    The hitchhiking grandfather - https://prino.neocities.org/
    Some REXX code for use on z/OS - https://prino.neocities.org/zOS/zOS-Tools.html

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From R.Wieser@address@is.invalid to alt.lang.asm on Mon Aug 18 09:47:25 2025

    Robert,

    > There's the obvious, from the existing code:
    > [snip code]

    You're a bit light on information, so I have to make some assumptions :

    "il_dloc" is a constant, holding the value 63

    Your current strings all have the length 63 (in their first byte)

    Danger, Will Robinson, danger!

    Assume an empty string, padded with 63 spaces. ECX will count down to zero, and the loop will exit there only if you are lucky enough that the string-length byte is *not* 32.

    Just imagine /someone/ has stored 32 spaces (padded with another 31 spaces) and correctly set the string's length. Yep, the string-length byte would look like another space, causing ECX to underflow and wrap around. (Don't say never, as Murphy's law tries to tell us. :-) )

    IOW, for code less likely to bomb you also need to check for ECX underflowing (becoming less than one).
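
The underflow check can be sketched as follows, again in C rather than assembler. This is only an illustration: `PSTR_MAX` (assumed to be 63), `pstr_set` and `trim_bounded` are hypothetical names, and the scan starts from the stored length byte instead of a fixed `il_dloc`.

```c
#include <assert.h>
#include <string.h>

enum { PSTR_MAX = 63 };   /* assumed maximum length of the Pascal string */

/* Hypothetical helper: fill s[1..63] with text padded by blanks and store
   len_byte in the length byte s[0]. */
static void pstr_set(unsigned char s[PSTR_MAX + 1], const char *text,
                     unsigned len_byte)
{
    size_t len = strlen(text);
    assert(len <= PSTR_MAX && len_byte <= PSTR_MAX);
    memset(s, ' ', PSTR_MAX + 1);
    memcpy(s + 1, text, len);
    s[0] = (unsigned char)len_byte;
}

/* Strip trailing blanks in place and return the new length.
   The n >= 1 test is the underflow check: without it, a length byte of 32
   (which is also the code of ' ') would be taken for one more blank and the
   scan would walk off the front of the string. */
static unsigned trim_bounded(unsigned char *s)
{
    unsigned n = s[0];
    if (n > PSTR_MAX)
        n = PSTR_MAX;
    while (n >= 1 && s[n] == ' ')
        --n;
    s[0] = (unsigned char)n;
    return n;
}
```

For a string of 32 blanks whose length byte is 32, `trim_bounded` stops at index 0 and stores a length of 0. The price is one extra compare per iteration.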

    -- part #2

    > Obviously I could just look for four trailing blanks (EAX), add 3 to ecx
    > on non-blank, look for two TB's (AX), and then for 1 TB (AL), but is there
    > anything cleverer?

    Not that I know of.

    Other than getting rid of that "add 3 to ecx on non-blank" that is : assume that ECX points to the /last/ to-check character (start with il_dloc + 4)
    and check it, and the three chars before it.

    Don't forget to check for ECX underflow.


    Though if speed is the target, you could take a look at "scasd" (moving "backwards" over the string), followed by a "scasw" and "scasb". More
    setup (and teardown) needed, but /possibly/ faster in execution.

    Regards,
    Rudy Wieser


  • From Robert Prins@robert@prino.org to alt.lang.asm on Mon Aug 18 14:05:39 2025

    On 2025-08-18 07:47, R.Wieser wrote:
    > Robert,
    >
    >> There's the obvious, from the existing code:
    >> [snip code]
    >
    > You're a bit light on information, so I have to make some assumptions :
    >
    > "il_dloc" is a constant, holding the value 63
    >
    > Your current strings all have the length 63 (in their first byte)

    I read a full line, use 2 YMM moves to move the 63(+1) characters into the string, set string[0] to 63, and scan. Empty strings are not possible; the shortest ones are 4 characters.

    > Danger, Will Robinson, danger!
    >
    > Assume an empty string, padded with 63 spaces. ECX will count down to zero,
    > and the loop will exit there only if you are lucky enough that the
    > string-length byte is *not* 32.
    >
    > Just imagine /someone/ has stored 32 spaces (padded with another 31 spaces)
    > and correctly set the string's length. Yep, the string-length byte would
    > look like another space, causing ECX to underflow and wrap around. (Don't
    > say never, as Murphy's law tries to tell us. :-) )
    >
    > IOW, for code less likely to bomb you also need to check for ECX
    > underflowing (becoming less than one).

    The program is processing my own data, and there might be a handful of others using it; a few weeks ago, for the first time in a couple of years, someone asked for a copy, but still hasn't used it. It's written (nominally) in Virtual Pascal, but probably well over 90% is nowadays inline assembler, including significant use of post-Pentium instructions (MMX, SSEx and even AVX). The source can be found at <https://prino.neocities.org/miscellaneous/hitchtech.html>, in lift32bit.rar.

    > -- part #2
    >
    >> Obviously I could just look for four trailing blanks (EAX), add 3 to ecx
    >> on non-blank, look for two TB's (AX), and then for 1 TB (AL), but is there
    >> anything cleverer?
    >
    > Not that I know of.
    >
    > Other than getting rid of that "add 3 to ecx on non-blank" that is : assume
    > that ECX points to the /last/ to-check character (start with il_dloc + 4)
    > and check it, and the three chars before it.
    >
    > Don't forget to check for ECX underflow.
    >
    >
    > Though if speed is the target, you could take a look at "scasd" (moving
    > "backwards" over the string), followed by a "scasw" and "scasb". More
    > setup (and teardown) needed, but /possibly/ faster in execution.

    I think the overhead of SCASx is way too high for such short strings.

    FWIW, the program runs in less than 0.5 seconds in the assembler-ised version, and in 0.75 seconds in the 99.9% pure Pascal version, and speeding it up, ha, ha, ha, is just something to keep my mind engaged.

    Robert

    PS: And no, given this reply by a long-time experienced Pascal user, the pure Pascal Version will not work in FPC and I've never tried to compile it in Delphi
    6 or the old free TurboDelphi. Then again, neither of them would ever be able to
    perform the equivalent of my manual optimisations.
    --
    Robert AH Prins
    robert(a)prino(d)org
    The hitchhiking grandfather - https://prino.neocities.org/
    Some REXX code for use on z/OS - https://prino.neocities.org/zOS/zOS-Tools.html

  • From R.Wieser@address@is.invalid to alt.lang.asm on Mon Aug 18 14:31:59 2025

    Robert,

    ...
    >> Danger, Will Robinson, danger!
    ...
    > The program is processing my own data, and there might be a handful of
    > others using it

    I already got the feeling that that might be the case, but as I was not sure
    I wanted to point it out. Why exactly I don't know, as AFAIR you are no newbie ....

    > I think the overhead of SCASx is way too high for such short strings.

    Yep. Smart solutions (including your suggested compare in four-byte slices) often have that problem. :-\

    > FWIW, the program runs in less than 0.5 seconds in the assembler-ised
    > version, and in 0.75 seconds in the 99.9% pure Pascal version, and
    > speeding it up, ha, ha, ha, is just something to keep my mind engaged.

    You're talking to someone who writes Assembly programs for Windows as a
    hobby. I know the feeling. :-)

    Regards,
    Rudy Wieser

