• 0 vs. translate-none

    From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Wed Sep 17 16:53:05 2025
    From Newsgroup: comp.lang.forth

    This posting is a more general reflection about designing types in
    Forth; it just uses recognizers as an example.

    The original proposal for recognizers had R:FAIL as the result of a
    recognizer that did not recognize the input. Later that was renamed
    to NOTFOUND; then there was a proposal where 0 would be used instead,
    and Bernd Paysan changed all the uses of NOTFOUND in Gforth to 0.
    Finally, last Thursday the committee decided to go with
    TRANSLATE-NONE for that result.

    Bernd Paysan thought that it would be easy to change back to a non-0
    value for TRANSLATE-NONE, by looking at the patch that changed
    NOTFOUND to 0. However, in the meantime there has been more work
    done, so it's not so easy.

    E.g., there was a word

    ?FOUND ( x -- x )

    that would throw -13 if x=0. This word was used both with the result
    of recognizers and with nt|0 or xt|0. Fortunately, in this case the
    uses were easy to tell apart, and they are now addressed by two
    words: ?REC-FOUND (for recognizer results) and ?FOUND (for x|0).
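
    A minimal sketch of such a split, assuming TRANSLATE-NONE is a
    unique non-zero address. These definitions are illustrative only and
    are not taken from Gforth's source; on a system that already provides
    these names they simply shadow the existing ones.

    ```forth
    \ Illustrative definitions only; not Gforth's actual source.
    \ Assumption: TRANSLATE-NONE is a unique non-zero address.
    create translate-none 0 ,

    \ For nt|0 and xt|0 results: zero means "not found".
    : ?found ( x -- x )
        dup 0= if -13 throw then ;

    \ For recognizer results: TRANSLATE-NONE means "not recognized".
    : ?rec-found ( translation -- translation )
        dup translate-none = if -13 throw then ;
    ```

    With two words, code that checks a recognizer result no longer
    depends on whether TRANSLATE-NONE happens to be 0.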

    What do we learn from this? Merging two previously separate types
    such that they are dealt with (partly) by the same words (e.g., 0= in
    this case) is easy, as is mixing two kinds of sand. Separating two
    previously (partly) merged types to use type-specific words is a lot
    more work.

    You can fake it by defining 0 CONSTANT TRANSLATE-NONE, but then you
    never know if your code ports to other systems where TRANSLATE-NONE is
    non-zero. For now Gforth does it this way, but I don't expect that to
    be the final stage.

    Should we prefer to separate types or merge them? Both approaches
    have advantages:

    * With separate words for dealing with the types, we can easily find
    all uses of that type and do something about it. E.g., a while ago
    I changed the cs-item (control-flow stack item) in Gforth from three
    to four cells. This was relatively easy because there are only a
    few words in Gforth that deal with cs items.

    * With a merged approach, we can use the same words for dealing with
    several types, with further words building upon these words (instead
    of having to define the further words n times for n types). But
    that makes the separation problem even harder.

    Overall, I think that the merged approach is preferable, but only if
    you are sure that you will never need to separate the types (whether
    due to a committee decision or because some new requirement means that
    you have to change the representation of the type).

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2025 CFP: http://www.euroforth.org/ef25/cfp.html
    EuroForth 2025 registration: https://euro.theforth.net/
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From minforth@minforth@gmx.net to comp.lang.forth on Fri Sep 19 12:24:03 2025
    From Newsgroup: comp.lang.forth

    On 17.09.2025 at 18:53, Anton Ertl wrote:
    > This posting is a more general reflection about designing types in
    > Forth; it just uses recognizers as example.

    My gut feeling is that the standard Forth word zoo is already big
    enough. Why should one define return types now, after more than half
    a century of Forth's history? This is beyond me.

    However, if it's only for a text description of those recognizers,
    I wouldn't mind.

  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Fri Sep 19 15:45:47 2025
    From Newsgroup: comp.lang.forth

    minforth <minforth@gmx.net> writes:
    > On 17.09.2025 at 18:53, Anton Ertl wrote:
    >> This posting is a more general reflection about designing types in
    >> Forth; it just uses recognizers as example.
    >
    > My gut feeling is that the standard Forth word zoo is already big
    > enough. Why should one define return types now, after more than half
    > a century of Forth's history? This is beyond me.

    There is a discussion of 0 vs. R:FAIL in at least one of the versions
    of Matthias Trute's proposal, and a less thorough discussion of 0
    vs. NOTFOUND (=R:FAIL) in Bernd Paysan's proposal. To see an
    advantage of TRANSLATE-NONE (=R:FAIL=NOTFOUND), consider:

    : postpone ( "name" -- ) \ core
        \g Compiles the compilation semantics of @i{name}.
        parse-name rec-forth postponing ; immediate

    REC-FORTH is the system recognizer, which produces a translation (a
    representation of the parsed word/number/etc. on the stack). If the
    recognizer does not recognize "name", it produces TRANSLATE-NONE.

    POSTPONING then performs the postpone action for the translation. A
    straightforward implementation of translation tokens (the top cell of
    a translation) is a table of three xts:

    create translate-...
        ' ... , \ interpreting
        ' ... , \ compiling
        ' ... , \ postponing

    For TRANSLATE-NONE that would be:

    : undefined-word #-13 throw ;

    create translate-none
        ' undefined-word ,
        ' undefined-word ,
        ' undefined-word ,

    And POSTPONING can then be implemented as:

    : postponing ( translation -- )
        2 cells + @ execute ;

    However, if you use 0 instead of TRANSLATE-NONE, you would have to
    special-case that in POSTPONING:

    : postponing ( translation -- )
        dup 0= if -13 throw then
        2 cells + @ execute ;
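
    The table dispatch can be tried out in isolation. The following is a
    sketch under stated assumptions: TRANSLATE-DEMO and its three actions
    are hypothetical stand-ins that only leave a tag on the stack, and
    INTERPRETING and COMPILING are filled in by analogy with POSTPONING.

    ```forth
    \ Sketch only: TRANSLATE-DEMO and its actions are hypothetical.
    : undefined-word ( -- ) -13 throw ;

    create translate-none
        ' undefined-word ,  \ interpreting
        ' undefined-word ,  \ compiling
        ' undefined-word ,  \ postponing

    : demo-int  ( -- 1 ) 1 ;
    : demo-comp ( -- 2 ) 2 ;
    : demo-post ( -- 3 ) 3 ;

    create translate-demo
        ' demo-int ,   \ interpreting
        ' demo-comp ,  \ compiling
        ' demo-post ,  \ postponing

    \ Each word picks one cell of the table and executes it;
    \ none of them has to test for "not recognized".
    : interpreting ( ... translation -- ... )           @ execute ;
    : compiling    ( ... translation -- ... ) cell+     @ execute ;
    : postponing   ( ... translation -- ... ) 2 cells + @ execute ;
    ```

    Passing TRANSLATE-NONE to any of the three words throws -13 through
    the ordinary table lookup, with no extra branch.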

    - anton
  • From peter@peter.noreply@tin.it to comp.lang.forth on Fri Sep 19 19:39:29 2025
    From Newsgroup: comp.lang.forth

    On Fri, 19 Sep 2025 15:45:47 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

    <snip>

    On lxf64 each individual recognizer returns 0 when no match was found.
    The last recognizer to be tested is REC-ABORT (same as your REC-NONE).
    As a consequence REC-FORTH will never fail!
    I organize the recognizers in a linked list. REC-ABORT is hard-coded to
    always be last.

    The interpret word thus becomes very simple:

    M: STATE-TRANSLATING ( trans -- ) \ get the right xt for the current state
        2 state @ + cells+ @ execute ;

    : INTERPRET2 ( -- )
        begin parse-name
        dup while
            forth-recognize state-translating
        repeat 2drop
        ?stack ;
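
    The index computation 2 STATE @ + yields 2 while interpreting and 1
    while compiling only if STATE holds exactly 0 or -1; the standard
    just says the compilation-state value is non-zero. A portable sketch
    of the same lookup follows. It assumes that cell 0 of the table holds
    the postponing xt (a guess from the index arithmetic) and that CELLS+
    means CELLS + ; TRANSLATE-DEMO, LAST-ACTION and PROBE are
    hypothetical names used only for this demonstration.

    ```forth
    variable last-action   \ hypothetical: records which action ran

    : demo-post ( -- ) 0 last-action ! ;
    : demo-comp ( -- ) 1 last-action ! ;
    : demo-int  ( -- ) 2 last-action ! ;

    create translate-demo
        ' demo-post ,  \ cell 0: postponing (assumed layout)
        ' demo-comp ,  \ cell 1: compiling
        ' demo-int ,   \ cell 2: interpreting

    \ Same table lookup as 2 STATE @ + , but without relying on
    \ STATE holding exactly -1 in compilation state.
    : state-translating ( ... translation -- ... )
        state @ if 1 else 2 then cells + @ execute ;

    \ Immediate probe, so the lookup can also run in compilation state.
    : probe ( -- ) translate-demo state-translating ; immediate
    ```

    Typing PROBE at the prompt runs the interpreting action; using it
    inside a colon definition runs the compiling action.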

    BR
    Peter

  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Sat Sep 20 07:25:54 2025
    From Newsgroup: comp.lang.forth

    peter <peter.noreply@tin.it> writes:
    > On lxf64 each individual recognizer returns 0 when no match was found.
    > The last recognizer to be tested is REC-ABORT (same as your REC-NONE).

    REC-NONE is the neutral element of recognizer sequences, i.e., as far
    as the sequence is concerned, a noop. You can prepend REC-NONE to a
    recognizer sequence and the sequence will produce the same result.
    The implementation of REC-NONE is:

    : rec-none ( c-addr u -- translation )
        2drop translate-none ;

    I doubt that your REC-ABORT works like that. My guess is that your
    REC-ABORT is:

    : rec-abort -13 throw ;

    and I will work with that guess in the following.

    > As a consequence REC-FORTH will never fail!

    In the proposal, any recognizer and recognizer sequence, including
    that in REC-FORTH, can have TRANSLATE-NONE as a result, which
    indicates that the recognizer (sequence) did not recognize the string.

    > The interpret word thus becomes very simple
    >
    > M: STATE-TRANSLATING ( trans -- ) \ get the right xt for the current state
    > 2 state @ + cells+ @ execute ;
    >
    > : INTERPRET2 ( -- )
    > begin parse-name
    > dup while
    > forth-recognize state-translating
    > repeat 2drop
    > ?stack ;

    The same implementation can be used with the proposal (but it calls
    FORTH-RECOGNIZE by a new name: REC-FORTH) and TRANSLATE-NONE.
    Compared to what I presented, the order of xts in the
    TRANSLATE-... tables is reversed, so POSTPONING would become even
    simpler:

    : POSTPONING ( translation -- )
        @ execute ;

    One difference is that, for an unrecognized string, the -13 throw is
    done later, when performing the action of the translation.

    The benefit of having TRANSLATE-NONE and doing the -13 throw in its
    actions, instead of hard-coded in REC-FORTH is that REC-FORTH contains
    just another recognizer sequence, that recognizer sequences behave
    like recognizers, and thus are nestable, and that you can write code
    like

    ( c-addr u ) rec-something ( translation ) postponing

    and it will work without you having to put REC-ABORT at the end of
    REC-SOMETHING.
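
    The nesting property can be illustrated with a small sketch. It is
    not the proposal's RECOGNIZER-SEQUENCE: mechanism; REC-SEQ2,
    REC-DIGIT, TRANSLATE-DIGIT, REC-INNER and REC-OUTER are hypothetical
    names, the translation tokens serve only as unique markers, and the
    code assumes Forth-2012 locals ( {: ... :} ).

    ```forth
    \ Sketch only: tokens are plain markers, names are hypothetical.
    create translate-none  0 ,
    create translate-digit 0 ,

    : rec-none ( c-addr u -- translate-none )
        2drop translate-none ;

    \ Recognizes a one-character string holding a decimal digit.
    : rec-digit ( c-addr u -- n translate-digit | translate-none )
        1 = if
            c@ [char] 0 - dup 10 u< if translate-digit exit then
        then
        drop translate-none ;

    \ Try xt1, then xt2. The result is again a recognizer result,
    \ so a sequence can be used wherever a recognizer can.
    : rec-seq2 {: c-addr u xt1 xt2 -- ... translation :}
        c-addr u xt1 execute
        dup translate-none <> if exit then  drop
        c-addr u xt2 execute ;

    \ REC-NONE in front of the sequence changes nothing.
    : rec-inner ( c-addr u -- ... translation )
        ['] rec-none ['] rec-digit rec-seq2 ;

    \ A sequence nested inside another sequence.
    : rec-outer ( c-addr u -- ... translation )
        ['] rec-inner ['] rec-none rec-seq2 ;
    ```

    Because an unrecognized string yields TRANSLATE-NONE at every level,
    neither sequence needs a terminating REC-ABORT.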

    However, the current proposal does not propose to standardize
    POSTPONING etc., but leaves it to the standard text interpreter and
    standard POSTPONE to perform the translation actions. So, as long as
    we don't standardize these words, one could also have a recognizer
    sequence

    ' rec-abort ' rec-forth 2 recognizer-sequence: rec-forth-abort

    and let the text interpreter and POSTPONE call REC-FORTH-ABORT instead
    of REC-FORTH. But if we want to leave the option open to standardize
    POSTPONING etc. in the future, the proposed approach is more flexible.

    - anton
  • From peter@peter.noreply@tin.it to comp.lang.forth on Sat Sep 20 10:34:35 2025
    From Newsgroup: comp.lang.forth

    On Sat, 20 Sep 2025 07:25:54 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:
    > peter <peter.noreply@tin.it> writes:
    >> On lxf64 each individual recognizer returns 0 when no match was found.
    >> The last recognizer to be tested is REC-ABORT (same as your REC-NONE).
    >
    > REC-NONE is the neutral element of recognizer sequences, i.e., as far
    > as the sequence is concerned, a noop. You can prepend REC-NONE to a
    > recognizer sequence and the sequence will produce the same result.
    > The implementation of REC-NONE is:
    >
    > : rec-none ( c-addr u -- translation )
    > 2drop translate-none ;
    >
    > I doubt that your REC-ABORT works like that. My guess is that your
    > REC-ABORT is:
    >
    > : rec-abort -13 throw ;

    :NONAME -13 throw ;
    dup-t
    dup-t
    CREATE TRANSLATE-ABORT
    ,-d-t ,-d-t ,-d-t

    : REC-ABORT ( addr len -- nt)
        >msg translate-abort ;

    The -t and -d-t endings are due to this being metacompiled.
    >msg saves the string to be able to print the name in the abort message.

    > and I will work with that guess in the following.

    <snip>

    > : POSTPONING ( translation -- )
    > @ execute ;

    : postpone ( "name" -- )
        parse-name forth-recognize @ execute ; immediate

    <snip>

    Apart from the names that have changed (I have not updated them yet) I
    see only minor differences. One is the individual recognizers
    returning 0 on failure.

    I implemented your proposal from February and it has worked as expected.
    The float recognizer has been cleaned up: the deferred words are gone,
    and the work is now done when the float package is included.

    I am using a linked list of recognizers but that is only an
    implementation detail.
    I do not see that the "standard" would mandate an array.

    I have introduced a vocabulary-like word for managing the recognizers.

    It looks like

    ' rec-local recognizer: Locals-recognizer

    This does 3 things

    - It gives the recognizer a name.

    - It inserts the recognizer in the list just behind the number recognizers

    - Executing it moves the recognizers to the top of the list

    I have also adjusted ORDER to also show the recognizers

    order
    Order: $0070|01C0 Forth
    $0070|01E8 Root

    Current: $0070|01C0 Forth

    Loaded recognizers:
    $0070|18E8 Locals-recognizer
    $0070|0648 Word-recognizer
    $0070|0620 Number-recognizer
    $0070|1C88 Float-recognizer
    $0070|1D28 String-recognizer
    $0070|19B0 Tick-recognizer
    $0070|1948 To-recognizer
    $0070|1970 To2-recognizer
    $0070|1A20 Only-recognizer
    $0070|0600 Abort not found

    ok

    BR
    Peter
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Sat Sep 20 16:08:03 2025
    From Newsgroup: comp.lang.forth

    peter <peter.noreply@tin.it> writes:
    > On Sat, 20 Sep 2025 07:25:54 GMT
    > anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

    >> I doubt that your REC-ABORT works like that. My guess is that your
    >> REC-ABORT is:
    >>
    >> : rec-abort -13 throw ;
    >
    > :NONAME -13 throw ;
    > dup-t
    > dup-t
    > CREATE TRANSLATE-ABORT
    > ,-d-t ,-d-t ,-d-t
    >
    > : REC-ABORT ( addr len -- nt)
    > >msg translate-abort ;

    Ok, your TRANSLATE-ABORT is TRANSLATE-NONE, and your REC-ABORT is
    REC-NONE.

    > >msg saves the string to be able to print the name in the abort message

    Might be cleaner than Gforth's current mechanism (I don't remember how
    that works).

    > Apart from the names that has changed (I have not updated them yet) I see only
    > minor differences. One being the individual recognizer returning 0 on fail.

    So the usual recognizers return 0 for not-recognized, but REC-FORTH
    returns TRANSLATE-NONE. Interesting twist. What about other
    recognizer sequences?

    > I am using a linked list of recognizers but that is only an implementation detail.
    > I do not see that the "standard" would mandate an array.

    The standard does not mandate any particular implementation of
    REC-FORTH or of recognizer sequences.


    > I have introduced a vocabulary-like word for managing the recognizers
    >
    > It looks like
    >
    > ' rec-local recognizer: Locals-recognizer
    >
    > This does 3 things
    >
    > - It gives the recognizer a name.
    >
    > - It inserts the recognizer in the list just behind the number recognizers

    But before other recognizers behind the number recognizers?

    > - Executing it moves the recognizers to the top of the list

    I am confused. Under what circumstances does the "insert just behind"
    happen, and when "move to the top"?

    And what scenario do you have in mind that makes this behaviour
    useful?

    - anton
  • From peter@peter.noreply@tin.it to comp.lang.forth on Sat Sep 20 19:08:27 2025
    From Newsgroup: comp.lang.forth

    On Sat, 20 Sep 2025 16:08:03 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

    <snip>

    > So the usual recognizers return 0 for not-recognized, but REC-FORTH
    > returns TRANSLATE-NONE. Interesting twist. What about other
    > recognizer sequences?

    There are no other recognizer sequences. I have not found a use case for
    them. I guess I would implement them to also return TRANSLATE-NONE.




    >> - It inserts the recognizer in the list just behind the number recognizers
    >
    > But before other recognizers behind the number recognizers?

    Yes


    >> - Executing it moves the recognizers to the top of the list
    >
    > I am confused. Under what circumstances does the "insert just behind"
    > happen, and when "move to the top"?

    The insert behind happens only when the recognizer is created, so that
    the system is still standard. Executing the name moves it to the top
    in all other cases. ONLY, for example, contains

    : ONLY ( -- )
        1 #order ! root-wordlist context !
        number-recognizer word-recognizer locals-recognizer ;


    > And what scenario do you have in mind that makes this behaviour
    > useful?

    It has been useful when testing new recognizers.
    Otherwise I never change the ordering.

    Peter



  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Sat Sep 20 17:57:30 2025
    From Newsgroup: comp.lang.forth

    peter <peter.noreply@tin.it> writes:
    > It has been useful when testing new recognizers.

    I have been doing that by directly calling the recognizer. E.g.

    s" `dup" rec-tick . .
    s" `fkjfd" rec-tick .
    s" dup" rec-tick .

    - anton
  • From peter@peter.noreply@tin.it to comp.lang.forth on Sat Sep 20 20:55:12 2025
    From Newsgroup: comp.lang.forth

    On Sat, 20 Sep 2025 17:57:30 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

    > peter <peter.noreply@tin.it> writes:
    >> It has been useful when testing new recognizers.
    >
    > I have been doing that by directly calling the recognizer. E.g.
    >
    > s" `dup" rec-tick . .
    > s" `fkjfd" rec-tick .
    > s" dup" rec-tick .

    I do that also.
    A bit difficult with the string recognizer!

    I have recently been testing a new name-recognizer where I have
    translate-name and translate-name-immediate. I load it, put it before
    the normal name recognizer and load the ANS test suite.

    Peter


  • From albert@albert@spenarnc.xs4all.nl to comp.lang.forth on Sun Sep 21 10:37:18 2025
    From Newsgroup: comp.lang.forth

    In article <20250920103435.00002fbe@tin.it>,
    peter <peter.noreply@tin.it> wrote:
    <snip>

    In ciforth all "recognizers" are prefixes in the ONLY wordlist
    at startup (you can define recognizers yourself in any wordlist).
    They are modular: they have no connection to each other and are
    governed by the regular search order.

    0-9 and - +   recognize numbers
    ^ &           recognize control and regular chars
    '             recognizes a name and turns it into an address that
                  identifies the word
    "             recognizes strings

    All of these leave a constant.
    It is easy to add

    {             an anonymous code sequence that can be EXECUTEd.
                  This also leaves a constant. No need for a distinction
                  between :NONAME and [: .
    $ 0x # % 00   to recognize numbers in a certain base: hex, dec, bin, octal

    Change the number prefix to include floating point.

    All this is accomplished by a PREFIX flag (compare IMMEDIATE)
    and a provision that advances the interpreter pointer by
    the length of the prefix, not by the length of the word passed
    to it.

    It is believable that the system presented above is more powerful,
    but I would love to see examples of what it can do that warrant the
    complexity. I would also love to see whether those examples can't be
    done with my simpler setup.
    Recently I presented the Roman number prefix. How does
    that look with the recognizers presented?

    Groetjes Albert
    --
    The Chinese government is satisfied with its military superiority over USA.
    The next 5 year plan has as primary goal to advance life expectancy
    over 80 years, like Western Europe.
  • From minforth@minforth@gmx.net to comp.lang.forth on Sun Sep 21 13:56:04 2025
    From Newsgroup: comp.lang.forth

    On 21.09.2025 at 10:37, albert@spenarnc.xs4all.nl wrote:
    <snip>
    > All this is accomplished by a PREFIX flag (compare IMMEDIATE)
    > and a provision that advances the interpreter pointer by
    > the length of the prefixes, not by the length of the word passed
    > to it.
    >
    > It is believable that the system presented above is more powerful,
    > but I love to see examples what it can do that warrant the
    > complexity. Also I love to see if the examples can't be
    > done with my simpler setup.
    > Recently I presented the Roman number prefix. How does
    > that look in the recognizer presented.

    FWIW I also use suffixes for recognizers:
    let M be a matrix
    M-| auto-transposed
    M~ auto-inverted


  • From albert@albert@spenarnc.xs4all.nl to comp.lang.forth on Sun Sep 21 14:44:06 2025
    From Newsgroup: comp.lang.forth

    In article <mja7ekFbiv5U1@mid.individual.net>,
    minforth <minforth@gmx.net> wrote:

    FWIW I also use suffixes for recognizers:
    let M be a matrix
    M-| auto-transposed
    M~ auto-inverted

    The ability to use suffixes doesn't necessarily contribute to
    power. It adds confusion and makes parsing harder.
    Try it: program a suffix-aided recognizer for Roman numbers:
    MMXIIX:R

    Groetjes Albert
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Sun Sep 21 12:33:15 2025
    From Newsgroup: comp.lang.forth

    peter <peter.noreply@tin.it> writes:
    On Sat, 20 Sep 2025 17:57:30 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

    peter <peter.noreply@tin.it> writes:
    It has been useful when testing new recognizers.

    I have been doing that by directly calling the recognizer. E.g.

    s" `dup" rec-tick . .
    s" `fkjfd" rec-tick .
    s" dup" rec-tick .

    I do that also.
    A bit difficult with the string recognizer!

    In theory:

    s\" \"abc\"" rec-string scan-translate-string = . type \ -1 "abc"

    Trying to pass the result of the REC-STRING to INTERPRETING has
    revealed interesting restrictions in Gforth's implementation of the
    results of REC-STRING:

    s\" \"abc\"" rec-string interpreting
    *the terminal*:26:25: error: Scanned string not in input buffer

    parse-name "abc" rec-string scan-translate-string = . type \ -1 "abc"
    parse-name "abc" rec-string interpreting
    *the terminal*:29:29: error: Invalid memory address
    parse-name "abc" rec-string >>>interpreting<<<

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2025 CFP: http://www.euroforth.org/ef25/cfp.html
    EuroForth 2025 registration: https://euro.theforth.net/
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Sun Sep 21 12:50:14 2025
    From Newsgroup: comp.lang.forth

    albert@spenarnc.xs4all.nl writes:
    It is believable that the system presented above is more powerful,
    but I would love to see examples of what it can do that warrant the
    complexity. I would also love to see whether the examples can't be
    done with my simpler setup.
    Recently I presented the Roman number prefix. How does
    that look with the recognizers presented?

    I have already presented a recognizer for roman numerals in
    <2025Jun8.183524@mips.complang.tuwien.ac.at>, along with .ROMAN in
    <2025Jun9.082338@mips.complang.tuwien.ac.at> and variations on
    ROMAN>N? used for the recognizer in
    <2025Jun9.091538@mips.complang.tuwien.ac.at>. The examples are:

    MCMXLVIII . \ 1948
    mcmxlviii . \ error: undefined word
    MIM \ error: undefined word
    L . \ 50
    LLL \ error: undefined word
    MCMXLVIII LXXVII + . \ 2025
    MCMXLVIII LXXVII + .roman \ MMXXV

    Unlike your approach, one can write roman numerals in the same way
    that we learned in school. Does it warrant the complexity? I think
    that already the benefit of not having to FIND-NAME for all prefixes
    of a word we search for is a good reason to avoid your approach.

    The interface to the recognizer proposal has seen some renaming and
    other changes in the recent committee meeting, and you find the
    updated code for recognizing and printing roman numerals in:

    https://www.complang.tuwien.ac.at/forth/programs/roman-numerals.4th

    The recognizer stuff conforms with the recent proposal, the rest of
    the code uses Gforth extensions.

    - anton
  • From minforth@minforth@gmx.net to comp.lang.forth on Sun Sep 21 17:46:54 2025
    From Newsgroup: comp.lang.forth

    Am 21.09.2025 um 14:44 schrieb albert@spenarnc.xs4all.nl:
    In article <mja7ekFbiv5U1@mid.individual.net>,
    minforth <minforth@gmx.net> wrote:
    Am 21.09.2025 um 10:37 schrieb albert@spenarnc.xs4all.nl:
    <snip>
    All this is accomplished by a PREFIX flag (compare IMMEDIATE)
    and a provision that advances the interpreter pointer by
    the length of the prefixes, not by the length of the word passed
    to it.

    It is believable that the system presented above is more powerful,
    but I would love to see examples of what it can do that warrant the
    complexity. I would also love to see whether the examples can't be
    done with my simpler setup.
    Recently I presented the Roman number prefix. How does
    that look with the recognizers presented?

    FWIW I also use suffixes for recognizers:
    let M be a matrix
    M-| auto-transposed
    M~ auto-inverted

    The ability to use suffixes doesn't necessarily contribute to
    power. It adds confusion and makes parsing harder.
    Try it: program a suffix-aided recognizer for Roman numbers:
    MMXIIX:R


    YMMV but in my world it is more important to have better readability
    of matrix equations, than to minimise Forth parser pain. ;-)

  • From peter@peter.noreply@tin.it to comp.lang.forth on Sun Sep 21 18:39:18 2025
    From Newsgroup: comp.lang.forth

    On Sun, 21 Sep 2025 12:33:15 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

    peter <peter.noreply@tin.it> writes:
    On Sat, 20 Sep 2025 17:57:30 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

    peter <peter.noreply@tin.it> writes:
    It has been useful when testing new recognizers.

    I have been doing that by directly calling the recognizer. E.g.

    s" `dup" rec-tick . .
    s" `fkjfd" rec-tick .
    s" dup" rec-tick .

    I do that also.
    A bit difficult with the string recognizer!

    In theory:

    s\" \"abc\"" rec-string scan-translate-string = . type \ -1 "abc"

    Trying to pass the result of the REC-STRING to INTERPRETING has
    revealed interesting restrictions in Gforth's implementation of the
    results of REC-STRING:

    s\" \"abc\"" rec-string interpreting
    *the terminal*:26:25: error: Scanned string not in input buffer

    parse-name "abc" rec-string scan-translate-string = . type \ -1 "abc"
    parse-name "abc" rec-string interpreting
    *the terminal*:29:29: error: Invalid memory address
    parse-name "abc" rec-string >>>interpreting<<<

    - anton

    This is the closest I have come

    s\" \"" rec-string Hej Peter" interpreting cr type
    Hej Peter ok

    It still gives me an extra space that is needed to separate rec-string
    from the continuation of the string. It works because rec-string does
    the parsing of the rest of the string. If the parsing were done in
    INTERPRETING, the string would have to come after that!

    Peter

  • From albert@albert@spenarnc.xs4all.nl to comp.lang.forth on Sun Sep 21 21:39:02 2025
    From Newsgroup: comp.lang.forth

    In article <2025Sep21.145014@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    Unlike your approach, one can write roman numerals in the same way
    that we learned in school.

    I think that it is good that the roman numerals are distinguished.
    VI M L X become suddenly reserved words. There is a reason that we
    prefer $CD and $DEADBEEF. If I drop the requirement of a uniform
    prefix, I can simplify my approach too.
    On the other hand you are certainly able to prefix the roman numerals
    if you wish with recognizers. So there is not much difference.

    Does it warrant the complexity? I think
    that already the benefit of not having to FIND-NAME for all prefixes
    of a word we search for is a good reason to avoid your approach.

    That is probably a misunderstanding of my approach.
    I understand that in your recognizer system all recognizers are tried
    in succession.
    Using a prefix like 0x for hex, the lookup for 0x is the same
    as the lookup for `` 1 CONSTANT 0x '' , so no separate mechanism is
    needed.
    Only after the word is found, a prefix is handled differently, compare
    immediate words. Such lookup is probably even less effort.

    Assume we have a PREFIX $ for hex.
    Think of a unix environment, where $ is used for environment variables
    and we want 0x for hex.

    NAMESPACE unix \ That is VOCABULARY with a built-in ALSO
    unix DEFINITIONS
    ' $ ALIAS 0x

    \ Warning: is not unique.
    : $ PARSE-NAME GET-ENV POSTPONE DLITERAL ; IMMEDIATE PREFIX

    ...
    ...
    PREVIOUS DEFINITIONS
    As soon as you kick unix out of the search order, $ is again the
    prefix for hex and 0xCD is no more recognized.
    In my opinion, using the regular search order for prefixes is
    probably an advantage.

    P.S. GET-ENV leaves a double. Adding POSTPONE DLITERAL makes that
    $XXXX can be used in compilation mode.

    The interface to the recognizer proposal has seen some renaming and
    other changes in the recent committee meeting, and you find the
    updated code for recognizing and printing roman numerals in:

    https://www.complang.tuwien.ac.at/forth/programs/roman-numerals.4th

    The recognizer stuff conforms with the recent proposal, the rest of
    the code uses Gforth extensions.

    - anton

    Groetjes Albert
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Mon Sep 22 06:56:31 2025
    From Newsgroup: comp.lang.forth

    albert@spenarnc.xs4all.nl writes:
    In article <2025Sep21.145014@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    Unlike your approach, one can write roman numerals in the same way
    that we learned in school.

    I think that it is good that the roman numerals are distinguished.
    VI M L X become suddenly reserved words.

    No, they become numbers. However, vi Vi vI m l x are not recognized as
    numbers, and you can use them to invoke the word. In Gforth, you can
    also use

    roman?L
    name?L

    to distinguish them.

    There is a reason that we
    prefer $CD and $DEADBEEF.

    Yes, the number prefixes have advantages when writing programs. I
    don't think I am going to be writing programs with roman numerals.

    Does it warrant the complexity? I think
    that already the benefit of not having to FIND-NAME for all prefixes
    of a word we search for is a good reason to avoid your approach.

    That is probably a misunderstanding of my approach.
    I understand that in your recognizer system all recognizers are tried
    in succession.
    Using a prefix like 0x for hex, the lookup for 0x is the same
    as the lookup for `` 1 CONSTANT 0x '' , so no separate mechanism is
    needed.
    Only after the word is found, a prefix is handled differently, compare
    immediate words. Such lookup is probably even less effort.

    My understanding is that if the user types

    123456789

    into the text interpreter, your text interpreter will search for

    123456789
    12345678
    1234567
    123456
    12345
    1234
    123
    12
    1

    and fail at the first 8 attempts, and finally match the ninth, and
    only then try to convert the string into a number. By contrast, with
    recognizers, every recognizer (including REC-NAME) only has to deal
    with the full string, and most other recognizers have simpler and
    cheaper checks than REC-NAME.
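
    The search sequence described above can be sketched in standard
    Forth. This is only an illustration of the order in which candidate
    strings would be tried under that understanding of the prefix
    approach; the word name .PREFIXES is made up for this sketch, and it
    prints the candidates instead of performing any dictionary lookup.

    ```forth
    \ Hypothetical helper: print every leading substring of a token,
    \ longest first, i.e. the candidates a prefix lookup would try.
    : .prefixes ( c-addr u -- )
      begin dup while      \ stop once the length has reached 0
        2dup cr type       \ show the current candidate
        1-                 \ shorten the candidate by one character
      repeat 2drop ;

    s" 123456789" .prefixes
    ```

    For a nine-character token this prints nine candidates, whereas a
    recognizer is handed the full token exactly once.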

    Assume we have a PREFIX $ for hex.
    Think of a unix environment, where $ is used for environment variables
    and we want 0x for hex.

    NAMESPACE unix \ That is VOCABULARY with a built-in ALSO
    unix DEFINITIONS
    ' $ ALIAS 0x

    \ Warning: is not unique.
    : $ PARSE-NAME GET-ENV POSTPONE DLITERAL ; IMMEDIATE PREFIX

    ...
    ...
    PREVIOUS DEFINITIONS
    As soon as you kick unix out of the search order, $ is again the
    prefix for hex and 0xCD is no more recognized.

    Gforth has REC-ENV and that is active by default, and there is usually
    no reason to eliminate it from the system recognizer sequence. You
    write ${HOME}.

    P.S. GET-ENV leaves a double. Adding POSTPONE DLITERAL makes that
    $XXXX can be used in compilation mode.

    It seems that your approach embraces state-smartness. By contrast,
    one benefit of recognizers is that they make it unnecessary to use
    words like S" or TO that often are implemented as state-smart words,
    or require unconventional mechanisms to avoid that.

    - anton
  • From albert@albert@spenarnc.xs4all.nl to comp.lang.forth on Mon Sep 22 10:09:18 2025
    From Newsgroup: comp.lang.forth

    In article <2025Sep22.085631@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    <SNIP>
    Only after the word is found, a prefix is handled differently, compare
    immediate words. Such lookup is probably even less effort.

    My understanding is that if the user types

    123456789

    into the text interpreter, your text interpreter will search for

    123456789
    12345678
    1234567
    123456
    12345
    1234
    123
    12
    1

    and fail at the first 8 attempts, and finally match the ninth, and
    only then try to convert the string into a number. By contrast, with
    recognizers, every recognizer (including REC-NAME) only has to deal
    with the full string, and most other recognizers have simpler and
    cheaper checks than REC-NAME.

    No. 123456789 is looked up in the Forth wordlist, fails, then in the
    minimum search wordlist.

    ' & ^ 0 1 2 3 4 5 6 7
    8 9 - + " FORTH

    123456789 matches the prefix 1. Then 1 does its thing,
    recognizes the number, and decides to leave it on the stack or
    compile code for it. Isn't that smart? (or is it the way Forth
    works from day one?)


    Assume we have a PREFIX $ for hex.
    Think of a unix environment, where $ is used for environment variables
    and we want 0x for hex.

    NAMESPACE unix \ That is VOCABULARY with a built-in ALSO
    unix DEFINITIONS
    ' $ ALIAS 0x

    \ Warning: is not unique.
    : $ PARSE-NAME GET-ENV POSTPONE DLITERAL ; IMMEDIATE PREFIX

    ...
    ...
    PREVIOUS DEFINITIONS
    As soon as you kick unix out of the search order, $ is again the
    prefix for hex and 0xCD is no more recognized.

    Gforth has REC-ENV and that is active by default, and there is usually
    no reason to eliminate it from the system recognizer sequence. You
    write ${HOME}.

    Does that invalidate the example?


    P.S. GET-ENV leaves a double. Adding POSTPONE DLITERAL makes that
    $XXXX can be used in compilation mode.

    It seems that your approach embraces state-smartness. By contrast,
    one benefit of recognizers is that they make it unnecessary to use
    words like S" or TO that often are implemented as state-smart words,
    or require unconventional mechanisms to avoid that.

    No I don't. Numbers have always been state-smart, although you
    won't admit to it.
    In my system you can't postpone numbers, so that cannot lead to
    problems. "AAP" is recognized but
    POSTPONE "AAP"
    POSTPONE 12345
    is rejected.
    So "AAP" is a generalised number and shares the property that it
    may be state-smart, but as with numbers there are no evil consequences.
    Not by clever planning, but by sound design.

    P.S. I don't intend to present or defend PREFIX as an alternative to
    the recognizer proposals, but it is more sound than people think.
    Also it is lean, so advantageous for vintage systems.


    - anton

    Groetjes Albert
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Mon Sep 22 08:39:34 2025
    From Newsgroup: comp.lang.forth

    albert@spenarnc.xs4all.nl writes:
    In article <2025Sep22.085631@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    <SNIP>
    Only after the word is found, a prefix is handled differently, compare
    immediate words. Such lookup is probably even less effort.

    My understanding is that if the user types

    123456789

    into the text interpreter, your text interpreter will search for

    123456789
    12345678
    1234567
    123456
    12345
    1234
    123
    12
    1

    and fail at the first 8 attempts, and finally match the ninth, and
    only then try to convert the string into a number. By contrast, with
    recognizers, every recognizer (including REC-NAME) only has to deal
    with the full string, and most other recognizers have simpler and
    cheaper checks than REC-NAME.

    No. 123456789 is looked up in the Forth wordlist, fails, then in the
    minimum search wordlist.

    ' & ^ 0 1 2 3 4 5 6 7
    8 9 - + " FORTH

    123456789 matches the prefix 1.

    How so? Linear search through the wordlist, with prefix matching?
    That's even slower than the approach outlined above (when that
    approach is implemented using hash tables).

    And how does matching "0r" for your roman numerals work, if "0rM"
    matches the prefix "0"?

    Assume we have a PREFIX $ for hex.
    Think of a unix environment, where $ is used for environment variables
    and we want 0x for hex.

    NAMESPACE unix \ That is VOCABULARY with a built-in ALSO
    unix DEFINITIONS
    ' $ ALIAS 0x

    \ Warning: is not unique.
    : $ PARSE-NAME GET-ENV POSTPONE DLITERAL ; IMMEDIATE PREFIX

    ...
    ...
    PREVIOUS DEFINITIONS
    As soon as you kick unix out of the search order, $ is again the
    prefix for hex and 0xCD is no more recognized.

    Gforth has REC-ENV and that is active by default, and there is usually
    no reason to eliminate it from the system recognizer sequence. You
    write ${HOME}.

    Does that invalidate the example?

    It means that we can mix standard syntax for hex numbers and
    environment variables freely, without having to shadow one with the
    other, or kicking one to be able to use the other.

    P.S. GET-ENV leaves a double. Adding POSTPONE DLITERAL makes that
    $XXXX can be used in compilation mode.

    It seems that your approach embraces state-smartness. By contrast,
    one benefit of recognizers is that they make it unnecessary to use
    words like S" or TO that often are implemented as state-smart words,
    or require unconventional mechanisms to avoid that.

    No I don't. Numbers have always been state-smart, although you
    won't admit to it.

    A state-smart 123 would behave like

    : 123
    123 state @ if postpone literal then ; immediate

    By contrast, a normal 123 behaves like

    : 123
    123 ;

    Here is a test

    : p123 postpone 123 ; : test [ p123 ] ; test .

    Let's see how it works (outputs are shown with preceding "\ "):

    : 123 \ compiling
    123 state @ if postpone literal then ; immediate
    \ *terminal*:2:40: warning: defined literal 123 as word ok
    : p123 postpone 123 ; : test [ p123 ] ; test .
    \ *the terminal*:3:39: error: Control structure mismatch
    \ : p123 postpone 123 ; : test [ p123 ] >>>;<<< test

    Now with a freshly started system:

    : 123 \ compiling
    123 ;
    \ *terminal*:2:7: warning: defined literal 123 as word ok
    : p123 postpone 123 ; : test [ p123 ] ; test . \ 123 ok

    Now with a freshly started system:

    : p123 postpone 123 ; : test [ p123 ] ; test . \ 123 ok

    The last example uses REC-NUM to recognize 123. It behaves like the
    normal (not state-smart) word 123, showing that numbers are not
    state-smart.

    You may say that in a traditional system the last test will not work,
    because POSTPONE does not work with numbers. That's true, but not
    proof of any state-smartness. It just means that we have to look at
    the implementation to decide it. And in the traditional
    implementation the text interpreter decides whether to perform the
    interpretation or compilation semantics of a number, whereas in a
    state-smart word, these two semantics are the same (immediate), and
    the word itself decides when it is run what to do, based on STATE. No
    such thing happens with numbers, so they are not state-smart, not even
    in a traditional system.

    In my system you can't postpone numbers, so that cannot lead to
    problems.

    That is certainly a good idea if you have made the mistake of
    embracing state-smartness in your system, but it is another
    disadvantage of your approach compared to recognizers.

    - anton
  • From albert@albert@spenarnc.xs4all.nl to comp.lang.forth on Mon Sep 22 23:03:33 2025
    From Newsgroup: comp.lang.forth

    In article <2025Sep22.103934@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    albert@spenarnc.xs4all.nl writes:
    In article <2025Sep22.085631@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    <SNIP>
    Only after the word is found, a prefix is handled differently, compare
    immediate words. Such lookup is probably even less effort.

    My understanding is that if the user types

    123456789

    into the text interpreter, your text interpreter will search for

    123456789
    12345678
    1234567
    123456
    12345
    1234
    123
    12
    1

    and fail at the first 8 attempts, and finally match the ninth, and
    only then try to convert the string into a number. By contrast, with
    recognizers, every recognizer (including REC-NAME) only has to deal
    with the full string, and most other recognizers have simpler and
    cheaper checks than REC-NAME.

    No. 123456789 is looked up in the Forth wordlist, fails, then in the
    minimum search wordlist.

    ' & ^ 0 1 2 3 4 5 6 7
    8 9 - + " FORTH

    123456789 matches the prefix 1.

    How so? Linear search through the wordlist, with prefix matching?
    That's even slower than the approach outlined above (when that
    approach is implemented using hash tables).

    I use a simple Forth and there linear search is acceptable for me.


    And how does matching "0r" for your roman numerals work, if "0rM"
    matches the prefix "0"?

    Normal precedence rules for Forth. 0r is later defined so it is
    probed earlier.
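
    A minimal sketch of this probing order in standard Forth (Core plus
    COMPARE from the String word set). PREFIX? and MATCH are made-up
    names, and the three hard-coded entries merely stand in for a real
    dictionary; this is an illustration under those assumptions, not the
    ciforth implementation.

    ```forth
    \ Hypothetical: true if the name (c-addr2 u2) is a prefix of the
    \ token (c-addr1 u1). The comparison is case-sensitive.
    : prefix? ( c-addr1 u1 c-addr2 u2 -- flag )
      rot over u< if            \ token shorter than name: cannot match
        2drop drop false exit
      then
      tuck compare 0= ;         \ compare the first u2 chars of the token

    \ Stand-in for the dictionary: entries are probed newest first,
    \ so the later-defined 0r is tried before the older 0.
    : match ( c-addr u -- )
      2dup s" 0r" prefix? if 2drop ." prefix 0r" exit then
      2dup s" 1"  prefix? if 2drop ." prefix 1"  exit then
      2dup s" 0"  prefix? if 2drop ." prefix 0"  exit then
      2drop ." no prefix" ;

    : demo
      cr s" 0rM"       match
      cr s" 123456789" match
      cr s" 0777"      match
      cr s" dup"       match ;
    demo
    ```

    Running DEMO reports 0r, 1, 0 and no prefix, in that order. Swapping
    the first and third lines of MATCH would let 0 capture 0rM, which is
    why definition order matters in this scheme.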


    Assume we have a PREFIX $ for hex.
    Think of a unix environment, where $ is used for environment variables
    and we want 0x for hex.

    NAMESPACE unix \ That is VOCABULARY with a built-in ALSO
    unix DEFINITIONS
    ' $ ALIAS 0x

    \ Warning: is not unique.
    : $ PARSE-NAME GET-ENV POSTPONE DLITERAL ; IMMEDIATE PREFIX

    ...
    ...
    PREVIOUS DEFINITIONS
    As soon as you kick unix out of the search order, $ is again the
    prefix for hex and 0xCD is no more recognized.

    Gforth has REC-ENV and that is active by default, and there is usually
    no reason to eliminate it from the system recognizer sequence. You
    write ${HOME}.

    Does that invalidate the example?

    It means that we can mix standard syntax for hex numbers and
    environment variables freely, without having to shadow one with the
    other, or kicking one to be able to use the other.


    P.S. GET-ENV leaves a double. Adding POSTPONE DLITERAL makes that
    $XXXX can be used in compilation mode.

    It seems that your approach embraces state-smartness. By contrast,
    one benefit of recognizers is that they make it unnecessary to use
    words like S" or TO that often are implemented as state-smart words,
    or require unconventional mechanisms to avoid that.

    No I don't. Numbers have always been state-smart, although you
    won't admit to it.

    A state-smart 123 would behave like

    : 123
    123 state @ if postpone literal then ; immediate

    By contrast, a normal 123 behaves like

    : 123
    123 ;

    Here is a test

    : p123 postpone 123 ; : test [ p123 ] ; test .

    Let's see how it works (outputs are shown with preceding "\ "):

    : 123 \ compiling
    123 state @ if postpone literal then ; immediate
    \ *terminal*:2:40: warning: defined literal 123 as word ok
    : p123 postpone 123 ; : test [ p123 ] ; test .
    \ *the terminal*:3:39: error: Control structure mismatch
    \ : p123 postpone 123 ; : test [ p123 ] >>>;<<< test

    Now with a freshly started system:

    : 123 \ compiling
    123 ;
    \ *terminal*:2:7: warning: defined literal 123 as word ok
    : p123 postpone 123 ; : test [ p123 ] ; test . \ 123 ok

    Now with a freshly started system:

    : p123 postpone 123 ; : test [ p123 ] ; test . \ 123 ok

    The last example uses REC-NUM to recognize 123. It behaves like the
    normal (not state-smart) word 123, showing that numbers are not
    state-smart.

    You may say that in a traditional system the last test will not work,
    because POSTPONE does not work with numbers. That's true, but not
    proof of any state-smartness. It just means that we have to look at
    the implementation to decide it. And in the traditional
    implementation the text interpreter decides whether to perform the
    interpretation or compilation semantics of a number, whereas in a
    state-smart word, these two semantics are the same (immediate), and
    the word itself decides when it is run what to do, based on STATE. No
    such thing happens with numbers, so they are not state-smart, not even
    in a traditional system.

    You have tried to explain this to me several times, but this is the clearest.

    I terminate denotation words with [COMPILE] LITERAL or [COMPILE] DLITERAL.
    Suppose I change it to a system where INTERPRET checks whether an
    immediate word left something on the stack ( assuming a separate
    compilation check) and only in compilation mode adds a LITERAL
    (compiles LIT and the number).
    In that case denotations don't end with [COMPILE] LITERAL/DLITERAL.
    Would that be an acceptable implementation?


    In my system you can't postpone numbers, so that cannot lead to
    problems.

    That is certainly a good idea if you have made the mistake of
    embracing state-smartness in your system, but it is another
    disadvantage of your approach compared to recognizers.

    To end the controversy, maybe I have to admit I have smart numbers,
    but I manage to be ISO-94 compliant.

    "AAP"
    OK
    POSTPONE "AAP"
    POSTPONE "AAP" ? ciforth ERROR # 15 : CANNOT FIND WORD TO BE POSTPONED

    Maybe not ISO-2012 compliant.


    - anton

    Groetjes Albert
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Tue Sep 23 17:00:34 2025
    From Newsgroup: comp.lang.forth

    albert@spenarnc.xs4all.nl writes:
    In article <2025Sep22.103934@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    albert@spenarnc.xs4all.nl writes:
    123456789 matches the prefix 1.

    How so? Linear search through the wordlist, with prefix matching?
    That's even slower than the approach outlined above (when that
    approach is implemented using hash tables).

    I use a simple Forth and there linear search is acceptable for me.

    That's fine for you, but some of us compile substantial programs
    and/or use lots of lookups in wordlists at run time, and therefore
    need dictionary search to be fast.

    And how does matching "0r" for your roman numerals work, if "0rM"
    matches the prefix "0"?

    Normal precedence rules for Forth. 0r is later defined so it is
    probed earlier.

    So defining a prefix "0r" shadows all earlier-defined words starting
    with "0r"? One has to choose the prefixes well, but the same is true
    for what recognizers should recognize.

    You may say that in a traditional system the last test will not work,
    because POSTPONE does not work with numbers. That's true, but not
    proof of any state-smartness. It just means that we have to look at
    the implementation to decide it. And in the traditional
    implementation the text interpreter decides whether to perform the
    interpretation or compilation semantics of a number, whereas in a
    state-smart word, these two semantics are the same (immediate), and
    the word itself decides when it is run what to do, based on STATE. No
    such thing happens with numbers, so they are not state-smart, not even
    in a traditional system.

    You have tried to explain this to me several times, but this is the clearest.

    I terminate denotation words with [COMPILE] LITERAL or [COMPILE] DLITERAL.
    Suppose I change it to a system where INTERPRET checks whether an
    immediate word left something on the stack ( assuming a separate
    compilation check) and only in compilation mode adds a LITERAL
    (compiles LIT and the number).
    In that case denotations don't end with [COMPILE] LITERAL/DLITERAL.
    Would that be an acceptable implementation?

    Given that you make it impossible to use the prefixes in a way where
    the problems with state-smartness show up (by disallowing to tick or
    postpone the prefixed words), even a state-smart implementation is
    acceptable.

    However, LITERAL is a standard word that a conforming implementation
    cannot implement in a state-smart way.

    : lit, postpone literal ;
    : foo [ 1 lit, ] ;
    foo . \ 1

    (Gforth, iForth, SwiftForth64, and VFX64 process this example correctly).

    To end the controversy, maybe I have to admit I have smart numbers,
    but I manage to be ISO-94 compliant.

    "AAP"
    OK
    POSTPONE "AAP"
    POSTPONE "AAP" ? ciforth ERROR # 15 : CANNOT FIND WORD TO BE POSTPONED

    Maybe not ISO-2012 compliant.

    Forth-2012 does not include recognizers, much less a string
    recognizer. And the result of the recent meeting is that the proposal
    does not include a string recognizer, either (but it provides words
    that allow one to add a string recognizer). BTW, Forth-2012 is not an
    ISO standard.

    - anton
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Tue Sep 23 17:23:05 2025
    From Newsgroup: comp.lang.forth

    minforth <minforth@gmx.net> writes:
    FWIW I also use suffixes for recognizers:
    let M be a matrix
    M| auto-transposed
    M~ auto-inverted

    Can you give an example of a matrix with your matrix recognizer?

    - anton
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Tue Sep 23 17:25:18 2025
    From Newsgroup: comp.lang.forth

    peter <peter.noreply@tin.it> writes:
    This is the closest I have come

    s\" \"" rec-string Hej Peter" interpreting cr type
    Hej Peter ok

    It still gives me an extra space that is needed to separate rec-string
    from the continuation of the string.

    Yes, if you want to avoid that space, you lose a lot of the
    interactive testing benefit.

    But at least that works on lxf64.

    It works because rec-string does
    the parsing of the rest of the string. If the parsing is done in
    interpreting the string would have to be after that!

    The recommendation in the proposals is that recognizers don't do
    additional parsing (in order to be callable by, e.g., LOCATE or other
    tools), and that the translator should do the parsing.

    - anton
  • From peter@peter.noreply@tin.it to comp.lang.forth on Tue Sep 23 22:25:52 2025
    From Newsgroup: comp.lang.forth

    On Tue, 23 Sep 2025 17:25:18 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

    peter <peter.noreply@tin.it> writes:
    This is the closest I have come

    s\" \"" rec-string Hej Peter" interpreting cr type
    Hej Peter ok

    It still gives me an extra space that is needed to separate rec-string
    from the continuation of the string.

    Yes, if you want to avoid that space, you lose a lot of the
    interactive testing benefit.

    But at least that works on lxf64.

    If I do like this it works
    s\" \"" drop 0 rec-string Hej Peter" interpreting cr type
    Hej Peter ok


    It works because rec-string does
    the parsing of the rest of the string. If the parsing is done in
    interpreting the string would have to be after that!

    The recommendation in the proposals is that recognizers don't do
    additional parsing (in order to be callable by, e.g., LOCATE or other
    tools), and that the translator should do the parsing.

    Yes I think that would be a better solution. But it has its problems.

    today I have

    \ Recognizer for "text strings"

    : adj ( len -- len' ) \ if we are at the end of the parse area
                          \ we need to adjust what we step back
      source nip >in @ = + ;

    ' noop                                       \ interpret action
    :noname postpone sliteral ;                  \ compile action
    :noname postpone sliteral postpone 2lit, ;   \ postpone action
    translator: translate-string

    : rec-string ( addr len -- xi | 0 )
      swap c@ '"' =
      if   adj negate >in +! (s\") translate-string
      else drop 0 then ;

    I can easily move (s\") ( the interpretive part of S\")
    into the translator.
    But if I move also adj negate >in +! I need to send the len also.
    That is not as clean as doing the adjust in rec-string.

    Honestly I think using the string recognizer outside of recognizing
    is difficult.

    BR
    Peter


    - anton


  • From minforth@minforth@gmx.net to comp.lang.forth on Tue Sep 23 22:38:07 2025
    From Newsgroup: comp.lang.forth

    Am 23.09.2025 um 19:23 schrieb Anton Ertl:
    minforth <minforth@gmx.net> writes:
    FWIW I also use suffixes for recognizers:
    let M be a matrix
    M-| auto-transposed
    M~ auto-inverted

    Can you give an example of a matrix with your matrix recognizer?


    To be fair, here MinForth displays the matrix/vector stack in the
    QUIT prompt:

    MinForth 3.6 (64 bit) (fp matrix)
    # 0 0 matrix mat ok
    # m[ 1 2 3 ; 4 5 6 ] ok
    M: <1>
    [ 1 2 3 ; 4 5 6 ]
    # to mat ok
    # mat' ok
    M: <1>
    [ 1 4 ; 2 5 ; 3 6 ]
    # m[ 1 2 3 ; 4 5 6 ; 4 3 -2 ] := mat ok
    M: <1>
    [ 1 4 ; 2 5 ; 3 6 ]
    # mat~ ok
    M: <2>
    [ -2.333333 1.083333 -0.25 ; 2.666667 -1.166667 0.5 ; -0.6666667
    0.4166667 -0.25 ]
    [ 1 4 ; 2 5 ; 3 6 ]
    # m.
    [ -2.33333 1.08333 -0.25
    2.66667 -1.16667 0.5
    -0.666667 0.416667 -0.25 ] ok
    M: <1>
    [ 1 4 ; 2 5 ; 3 6 ]
    #

    The implementation won't help you at all because I use my own
    recognisers. It's an eyesore ;-)

    \ ------ Matrix/Vector Indexing ------

    \ Syntax: <name>( for positional indexing
    D: _[MVAL] i" mfmx=(mfMx*)mfpop(), mfmd=mfpop();" ;
    D: _MVAL [,] depth i" mfpush(mfmx);" [,] literal [,] _[mval] ;

    \ Syntax: <name>^ for heap addressing
    : _MVAL^ i" mfpush(mfmx->dat);" ;
    : _[MVAL^] _mval^ [,] literal ;

    \ Syntax: <name>' for implicit transposing
    : _MVAL' i" mfmup, mfm_set(mfmtos,mfmx), mfm_trans(mfmtos);" ;
    : _[MVAL'] _mval [,] _mval' ;

    \ Syntax: <name>~ for implicit inversion
    : _MVAL~ i" mfmup, mfm_set(mfmtos,mfmx), mfm_inv(mfmtos);" ;
    : _[MVAL~] _mval [,] _mval~ ;

    : __MXLITERAL? \ ( -- .. t | f ) recognize different matrix calls
    _parsed 2@ 1- _find-word IF dup cell+ @ _vmxmethods = IF
    C mfmx=(mfMx*)((mfpop())+2*MFSIZE), mfmd=mfsp-mfstk;
    _parsed 2@ 1- + c@
    dup '(' = IF drop ['] noop ['] _mval true EXIT THEN
    dup '^' = IF drop ['] _mval^ ['] _[mval^] true EXIT THEN
    dup ''' = IF drop ['] _mval' ['] _[mval'] true EXIT THEN
    '~' = IF ['] _mval~ ['] _[mval~] true EXIT THEN
    ELSE drop THEN THEN deferred _literal? ;
    IS _LITERAL?

    \ ------

    _LITERAL? is a deferrable element in the recognizer chain which
    is called in the INTERPRET loop (it is also used to recognize
    floats and complex numbers).
  • From albert@albert@spenarnc.xs4all.nl to comp.lang.forth on Wed Sep 24 00:41:24 2025
    From Newsgroup: comp.lang.forth

    In article <2025Sep23.190034@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    albert@spenarnc.xs4all.nl writes:
    In article <2025Sep22.103934@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    albert@spenarnc.xs4all.nl writes:

    So defining a prefix "0r" shadows all earlier-defined words starting
    with "0r"? One has to choose the prefixes well, but the same is true
    for what recognizers should recognize.

    A more realistic example is $ for hex numbers. A sensible implementation
    is to define that in the minimum search order, such that words added to
    Forth itself can start with $ e.g. $= to compare two strings.

    <SNIP>


    However, LITERAL is a standard word that a conforming implementation
    cannot implement in a state-smart way.

    : lit, postpone literal ;
    : foo [ 1 lit, ] ;
    foo . \ 1

    This shows me how to Lift this defect. Rename LITERAL to (LIT) and
    define
    : LITERAL 'LIT , , ; IMMEDIATE
    Then the above test succeeds.
    The interpretation syntax of LITERAL is undefined.
    LIT, is a sneaky way to have an interpretation syntax.
    Normal is
    : foo [ 1 ] LITERAL ;

    In the standard:
    LITERAL :
    Interpretation: Interpretation syntax for this word is undefined.

    What if the standard says
    execution of this word while in interpret mode is an ambiguous condition

    then I would gladly throw an exception if anybody tries it and the examples wouldn't fly.


    (Gforth, iForth, SwiftForth64, and VFX64 process this example correctly).

    To end the controversy, maybe I have to admit I have smart numbers,
    but I manage to be ISO-94 compliant.

    In view of the above, not yet.


    - anton
  • From albert@albert@spenarnc.xs4all.nl to comp.lang.forth on Wed Sep 24 01:15:44 2025
    From Newsgroup: comp.lang.forth

    In article <20250923222552.000054c6@tin.it>,
    peter <peter.noreply@tin.it> wrote:
    On Tue, 23 Sep 2025 17:25:18 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

    peter <peter.noreply@tin.it> writes:
    This is the closest I have come

    s\" \"" rec-string Hej Peter" interpreting cr type
    Hej Peter ok

    It still gives me an extra space that is needed to separate rec-string
    from the continuation of the string.

    Yes, if you want to avoid that space, you lose a lot of the
    interactive testing benefit.

    But at least that works on lxf64.

    If I do like this it works
    s\" \"" drop 0 rec-string Hej Peter" interpreting cr type
    Hej Peter ok


    It works because rec-string does
    the parsing of the rest of the string. If the parsing is done in
    interpreting the string would have to be after that!

    The recommendation in the proposals is that recognizers don't do
    additional parsing (in order to be callable by, e.g., LOCATE or other
    tools), and that the translator should do the parsing.

    Yes I think that would be a better solution. But it has its problems.

    today I have

    \ Recognizer for "text strings"

    : adj ( len -- len' ) \ if we are at the end of the parse area
                          \ we need to adjust what we step back
      source nip >in @ = + ;

    ' noop                                       \ interpret action
    :noname postpone sliteral ;                  \ compile action
    :noname postpone sliteral postpone 2lit, ;   \ postpone action
    translator: translate-string

    : rec-string ( addr len -- xi | 0 )
      swap c@ '"' =
      if   adj negate >in +! (s\") translate-string
      else drop 0 then ;

    I can easily move (s\") ( the interpretive part of S\")
    into the translator.
    But if I move also adj negate >in +! I need to send the len also.
    That is not as clean as doing the adjust in rec-string.

    Honestly I think using the string recognizer outside of recognizing
    is difficult.

    If you want to have quotes in a string,
    using escapes in strings is cumbersome.
    Once you have escapes, much caution is needed.
    I adopt the convention that a double quote in a string has to
    be doubled (as in Algol 68 and other languages).

    Your example then becomes
    s" "" rec-string Hej Peter" interpreting cr type
    (in my book
    """ rec-string Hej Peter" interpreting cr type )

    The implementation as a prefix " that does work in both
    interpretation and compilation mode is :

    : " 'SKIP , HERE >R 0 ,
    BEGIN &" PARSE PP@@ DUP &" = WHILE 2DROP 1+ DUP ALLOT R@ $+! REPEAT
    ?BLANK 0= 10 ?ERROR DROP DUP ALLOT R@ $+! ALIGN
    R> $@ POSTPONE DLITERAL ; IMMEDIATE PREFIX

    PP@@ parses a single character. SKIP (AHEAD) allows the prefix
    to be used inside a definition. I burn dictionary space for strings,
    even in interpret mode.
    The repeat loop immediately ends unless the next character is
    a double quote. It should be blank, or else an exception 10
    is thrown.
    String operators $! $@ $+! are assumed.
    DLITERAL is state-smart and is frowned upon.

    The PREFIX is the only hook to the system, it sets a flag in the
    header of " .


    peter

    Groetjes Albert

  • From dxf@dxforth@gmail.com to comp.lang.forth on Wed Sep 24 13:57:01 2025
    From Newsgroup: comp.lang.forth

    On 24/09/2025 8:41 am, albert@spenarnc.xs4all.nl wrote:
    In article <2025Sep23.190034@mips.complang.tuwien.ac.at>,
    ...

    However, LITERAL is a standard word that a conforming implementation
    cannot implement in a state-smart way.

    : lit, postpone literal ;
    : foo [ 1 lit, ] ;
    foo . \ 1

    This shows me how to Lift this defect. Rename LITERAL to (LIT) and
    define
    : LITERAL 'LIT , , ; IMMEDIATE
    Then the above test succeeds.
    The interpretation syntax of LITERAL is undefined.
    LIT, is a sneaky way to have an interpretation syntax.
    Normal is
    : foo [ 1 ] LITERAL ;

    In the standard:
    LITERAL :
    Interpretation: Interpretation syntax for this word is undefined.

    What if the standard says
    execution of this word while in interpret mode is an ambiguous condition

    then I would gladly throw an exception if anybody tries it and the examples wouldn't fly.

    Agreed. But the loophole frequently argued by parties since Forth-94 is that it was a 'minimum specification' supported by 'ambiguous conditions'. The latter ought not be seen as eternal damnation, rather the potential for more heavenly rewards.

  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Wed Sep 24 06:26:10 2025
    From Newsgroup: comp.lang.forth

    peter <peter.noreply@tin.it> writes:
    On Tue, 23 Sep 2025 17:25:18 GMT
    anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

    peter <peter.noreply@tin.it> writes:
    This is the closest I have come

    s\" \"" rec-string Hej Peter" interpreting cr type
    Hej Peter ok

    It still gives me an extra space that is needed to separate rec-string
    from the continuation of the string.

    Yes, if you want to avoid that space, you lose a lot of the
    interactive testing benefit.

    But at least that works on lxf64.

    If I do like this it works
    s\" \"" drop 0 rec-string Hej Peter" interpreting cr type
    Hej Peter ok

    Yes, whereas my similar attempts with Gforth failed. I have now
    managed to create a version that works:

    :noname parse-name rec-string interpreting ; execute "Hej Peter" cr type
    \ Hej Peter ok

    But having to go to these lengths is not something that we should be
    proud of.

    Honestly I think using the string recognizer outside of recognizing
    is difficult.

    In Gforth, the recognizer in isolation works:

    s\" \"" rec-string scan-translate-string = cr . type
    -1 " ok

    It's only when you then use INTERPRETING (i.e., perform the
    translation action), that the difficulties appear. However, I think
    that this is mainly the result of taking a shortcut in the
    implementation, not a fundamental problem.

    - anton
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Wed Sep 24 06:38:26 2025
    From Newsgroup: comp.lang.forth

    minforth <minforth@gmx.net> writes:
    Am 23.09.2025 um 19:23 schrieb Anton Ertl:
    minforth <minforth@gmx.net> writes:
    FWIW I also use suffixes for recognizers:
    let M be a matrix
    M-| auto-transposed
    M~ auto-inverted

    Can you give an example of a matrix with your matrix recognizer?


    To be fair, here MinForth displays the matrix/vector stack in the
    QUIT prompt:

    MinForth 3.6 (64 bit) (fp matrix)
    # 0 0 matrix mat ok
    # m[ 1 2 3 ; 4 5 6 ] ok

    Given this syntax, a parsing word M[ suggests itself to me (although I
    generally dislike parsing words and probably would choose a different
    syntax); or maybe a word that switches to a matrix interpreter
    (possibly implemented using the recognizer words, with ] switching
    back). Why did you choose to use a recognizer?

    - anton
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Wed Sep 24 06:45:35 2025
    From Newsgroup: comp.lang.forth

    albert@spenarnc.xs4all.nl writes:
    This shows me how to Lift this defect. Rename LITERAL to (LIT) and
    define
    : LITERAL 'LIT , , ; IMMEDIATE

    Looks good.

    In the standard:
    LITERAL :
    Interpretation: Interpretation syntax for this word is undefined.

    Has ISO changed the text? Forth-94 and Forth-2012 say:

    |Interpretation:
    |Interpretation semantics for this word are undefined.

    What if the standard says
    execution of this word while in interpret mode is an ambiguous condition

    It does not, and that's a good thing.

    - anton
  • From minforth@minforth@gmx.net to comp.lang.forth on Wed Sep 24 09:39:49 2025
    From Newsgroup: comp.lang.forth

    Am 24.09.2025 um 08:38 schrieb Anton Ertl:
    minforth <minforth@gmx.net> writes:
    Am 23.09.2025 um 19:23 schrieb Anton Ertl:
    minforth <minforth@gmx.net> writes:
    FWIW I also use suffixes for recognizers:
    let M be a matrix
    M-| auto-transposed
    M~ auto-inverted

    Can you give an example of a matrix with your matrix recognizer?


    To be fair, here MinForth displays the matrix/vector stack in the
    QUIT prompt:

    MinForth 3.6 (64 bit) (fp matrix)
    # 0 0 matrix mat ok
    # m[ 1 2 3 ; 4 5 6 ] ok

    Given this syntax, a parsing word M[ suggests itself to me (although I generally dislike parsing words and probably would choose a different syntax); or maybe a word that switches to a matrix interpreter
    (possibly implemented using the recognizer words, with ] switching
    back. Why did you choose to use a recognizer?

    I use MARKER to unload libraries when they have served their purpose.
    MARKER also resets the recognizer chain.

  • From albert@albert@spenarnc.xs4all.nl to comp.lang.forth on Wed Sep 24 10:28:39 2025
    From Newsgroup: comp.lang.forth

    In article <2025Sep24.083826@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    minforth <minforth@gmx.net> writes:
    Am 23.09.2025 um 19:23 schrieb Anton Ertl:
    minforth <minforth@gmx.net> writes:
    FWIW I also use suffixes for recognizers:
    let M be a matrix
    M-| auto-transposed
    M~ auto-inverted

    Can you give an example of a matrix with your matrix recognizer?


    To be fair, here MinForth displays the matrix/vector stack in the
    QUIT prompt:

    MinForth 3.6 (64 bit) (fp matrix)
    # 0 0 matrix mat ok
    # m[ 1 2 3 ; 4 5 6 ] ok

    Given this syntax, a parsing word M[ suggests itself to me (although I
    generally dislike parsing words and probably would choose a different
    syntax); or maybe a word that switches to a matrix interpreter
    (possibly implemented using the recognizer words, with ] switching
    back. Why did you choose to use a recognizer?

    WORDLIST suggests a different solution with a wordlist MATRIX:
    MAT( adds MATRIX to the search order
    )MAT removes MATRIX from the search order

    Circumstances may prevent this, but I think that is the situation
    that wordlists are intended for: creating a different
    interpretation/compilation environment.
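
    A minimal sketch of such a pair, using only the standard search-order
    words WORDLIST, GET-ORDER and SET-ORDER. The name MATRIX-WL is an
    assumption made for this illustration, and )MAT assumes that the
    matrix wordlist is still the first one in the search order:

    ```forth
    wordlist constant matrix-wl   \ hypothetical wordlist holding the matrix words

    \ push MATRIX-WL on top of the search order
    : mat(  ( -- ) get-order matrix-wl swap 1+ set-order ;

    \ drop the topmost wordlist from the search order again
    : )mat  ( -- ) get-order nip 1- set-order ;
    ```

    As written, both words are non-immediate, so they switch the search
    order only when interpreted; to use them inside a colon definition
    they would have to be made IMMEDIATE or wrapped in [ ... ].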


    - anton

    Groetjes Albert
  • From minforth@minforth@gmx.net to comp.lang.forth on Wed Sep 24 10:36:29 2025
    From Newsgroup: comp.lang.forth

    Am 24.09.2025 um 08:38 schrieb Anton Ertl:
    minforth <minforth@gmx.net> writes:
    Am 23.09.2025 um 19:23 schrieb Anton Ertl:
    minforth <minforth@gmx.net> writes:
    FWIW I also use suffixes for recognizers:
    let M be a matrix
    M-| auto-transposed
    M~ auto-inverted

    Can you give an example of a matrix with your matrix recognizer?


    To be fair, here MinForth displays the matrix/vector stack in the
    QUIT prompt:

    MinForth 3.6 (64 bit) (fp matrix)
    # 0 0 matrix mat ok
    # m[ 1 2 3 ; 4 5 6 ] ok

    Given this syntax, a parsing word M[ suggests itself to me (although I generally dislike parsing words and probably would choose a different syntax); or maybe a word that switches to a matrix interpreter
    (possibly implemented using the recognizer words, with ] switching
    back. Why did you choose to use a recognizer?

    M[ ... pushes a matrix literal onto the matrix stack.
    MATRIX (or VECTOR) define a persistent matrix/vector value.

    M[ is not really parsing, it just sets a flag for the forth
    interpreter. You could write
    M[ 1 fdup ] instead of M[ 1 1 ] or M[ 1. 1. ] or M[ 1e 1e ]

    IOW M[ does not use a recognizer.



  • From minforth@minforth@gmx.net to comp.lang.forth on Wed Sep 24 10:44:15 2025
    From Newsgroup: comp.lang.forth

    Am 24.09.2025 um 10:28 schrieb albert@spenarnc.xs4all.nl:
    In article <2025Sep24.083826@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    minforth <minforth@gmx.net> writes:
    Am 23.09.2025 um 19:23 schrieb Anton Ertl:
    minforth <minforth@gmx.net> writes:
    FWIW I also use suffixes for recognizers:
    let M be a matrix
    M-| auto-transposed
    M~ auto-inverted

    Can you give an example of a matrix with your matrix recognizer?


    To be fair, here MinForth displays the matrix/vector stack in the
    QUIT prompt:

    MinForth 3.6 (64 bit) (fp matrix)
    # 0 0 matrix mat ok
    # m[ 1 2 3 ; 4 5 6 ] ok

    Given this syntax, a parsing word M[ suggests itself to me (although I
    generally dislike parsing words and probably would choose a different
    syntax); or maybe a word that switches to a matrix interpreter
    (possibly implemented using the recognizer words, with ] switching
    back. Why did you choose to use a recognizer?

    WORDLIST suggest a different solution with a wordlist MATRIX
    MAT( adds MATRIX to the search order
    )MAT removes MATRIX from the search order

    Circumstances may prevent this, but I think that is the situation
    where wordlists are intended for, create a different interpretation/compile environment.

    In principle yes, but wordlists don't hook themselves into the Forth
    interpreter. IMO this is the only novelty of recognizers.
  • From dxf@dxforth@gmail.com to comp.lang.forth on Thu Sep 25 15:22:25 2025
    From Newsgroup: comp.lang.forth

    On 24/09/2025 4:45 pm, Anton Ertl wrote:
    albert@spenarnc.xs4all.nl writes:
    This shows me how to Lift this defect. Rename LITERAL to (LIT) and
    define
    : LITERAL 'LIT , , ; IMMEDIATE

    Looks good.

    In the standard:
    LITERAL :
    Interpretation: Interpretation syntax for this word is undefined.

    Has ISO changed the text? Forth-94 and Forth-2012 say:

    |Interpretation:
    |Interpretation semantics for this word are undefined.

    What if the standard says
    execution of this word while in interpret mode is an ambiguous condition

    It does not, and that's a good thing.

    "ambiguous condition:
    A circumstance for which this Standard does not prescribe a specific behavior for Forth systems and programs."

    "undefined" and "not prescribe a specific behavior" seem much alike to me. Either way, the Standard is saying don't do this thing. It's not as if
    they'd said nothing about it and left it up to you.

  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Thu Sep 25 06:36:40 2025
    From Newsgroup: comp.lang.forth

    dxf <dxforth@gmail.com> writes:
    "ambiguous condition:
    A circumstance for which this Standard does not prescribe a specific behavior
    for Forth systems and programs."

    "undefined" and "not prescribe a specific behavior" seem much alike to me.
    Either way, the Standard is saying don't do this thing.

    Not really. The standard just does not specify what happens in this
    case.

    It's not as if
    they'd said nothing about it and left it up to you.

    It's exactly that.

    As a programmer, if you know how the systems you are interested in
    behave, you can make use of that knowledge; the program will then not
    conform to the current standard, but still work as intended on these
    systems. In a standard based on common practice, that's the only
    way to achieve progress.

    - anton
  • From albert@albert@spenarnc.xs4all.nl to comp.lang.forth on Thu Sep 25 13:00:08 2025
    From Newsgroup: comp.lang.forth

    In article <2025Sep25.083640@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    dxf <dxforth@gmail.com> writes:
    "ambiguous condition:
    A circumstance for which this Standard does not prescribe a specific behavior
    for Forth systems and programs."

    "undefined" and "not prescribe a specific behavior" seem much alike to me.
    Either way, the Standard is saying don't do this thing.

    Not really. The standard just does not specify what happens in this
    case.

    It's not as if
    they'd said nothing about it and left it up to you.

    It's exactly that.

    As a programmer, if you know how the systems you are interested in
    behave, you can make use of that knowledge; the program will then not
    conform to the current standard, but still work as intended on these
    systems. In a standard based on common practice, that's the only
    way to achieve progress.

    That is for unsafe languages like Forth or Fortran. For Algol / Pascal,
    programs behave the same on all systems, except for restrictions
    due to the program environment like "memory exhausted", "too many
    nesting levels", or "floating point overflow", for the language model
    is based on infinite resources.
    The language definition is nailed down from day one and there are no
    ambiguous holes to be filled.

    Floating point is a slight exception. Programs can give
    different answers due to precision.

    - anton
  • From Ruvim@ruvim.pinka@gmail.com to comp.lang.forth on Fri Sep 26 02:19:36 2025
    From Newsgroup: comp.lang.forth

    On 2025-09-17 20:53, Anton Ertl wrote:
    This posting is a more general reflection about designing types in
    Forth; it just uses recognizers as example.

    The original proposal for recognizers had R:FAIL as the result of a recognizer that did not recognize the input. Later that was renamed
    to NOTFOUND; then there was a proposal where 0 would be used instead,
    and Bernd Paysan changed all the uses of NOTFOUND in Gforth to 0.
    Finally, on last Thursday the committee decided to go with
    TRANSLATE-NONE for that result.

    Bernd Paysan thought that it would be easy to change back to a non-0
    value for TRANSLATE-NONE, by looking at the patch that changed
    NOTFOUND to 0. However, in the meantime there has been more work
    done, so it's not so easy.

    E.g., there was a word

    ?FOUND ( x -- x )

    that would throw -13 if x=0. This word was used both with the result
    of recognizers and with nt|0 or xt|0. Fortunately, in this case the
    cases were easy to recognize, and they are now addressed by two words: ?REC-FOUND (for recognizer results) and ?FOUND (for x|0).

    A better name than `?rec-found` is `?recognized`.

    Given the pattern "rec-something ( sd -- qt|0 )", the pattern
    "?rec-something ( sd -- qt )" should be for words that accept a string
    and throw an exception if it is not recognized as "something".
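
    To make the distinction concrete, here is a rough sketch of the two
    kinds of check. All three names are stand-ins invented for this
    illustration (they are not Gforth's definitions), and MY-NONE merely
    plays the role of a non-zero TRANSLATE-NONE:

    ```forth
    create my-none   \ stand-in for a non-zero "not recognized" result (a unique address)

    \ for nt|0 or xt|0: throw -13 (undefined word) if x is 0
    : my-?found ( x -- x )
        dup 0= -13 and throw ;

    \ for recognizer results: throw -13 if the top cell is the "none" value
    : my-?rec-found ( x -- x )
        dup my-none = -13 and throw ;

    0 ' my-?found catch nip .   \ displays -13
    ```

    If the "none" value is 0, the two words coincide; once it is non-zero,
    every use has to pick the right one.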



    What do we learn from this? Merging two previously separate types
    such that they are dealt with (partly) the same words (e.g., 0= in
    this case) is easy, as is mixing two kinds of sand. Separating two previously (partly) merged types to use type-specific words is a lot
    more work.

    Yes. But this work is not justified in any way.


    I see the problem a little differently: in terms of subtyping and type hierarchies.

    If a type B is a subtype of a type A, then all words that accept any
    member of A also accept any member of B.

    So when introducing a new type C, the first challenge is to optimally
    choose the nearest supertype (or supertypes) for it.

    For example, if you make C a subtype of A, then all methods of A apply
    to C. If you make C a subtype of B, all methods of A and B apply to C.

    When choosing a supertype, the factors for consideration are:
    - consistency with existing types and methods;
    - minimizing the lexical code size of programs;
    - applying existing techniques and methods to the new types;
    - restrictions on implementations;

    We generally don't plan for future changes to subtype relationships.
    Yes, they can be changed during the design and experimentation phase,
    but that doesn't constitute an argument for choosing one supertype over another.

    Obviously, the more general a supertype is, the more implementation
    options are available and the fewer existing methods can be applied to
    members of the type. However, this dependence alone is also not an
    argument for choosing one supertype over another.





    Returning to recognizers.

    There is a quite general type: ( i*x x\0 ). Let's call it "any-nz".

    The unique feature of this type is that there is a simple and general
    method to check whether a data object is a member of this type: just
    check whether the top single-cell value is a non-zero. And this method
    applies to *any* subtype of this type. This method is made even more
    elegant by the fact that control flow operators apply it automatically.

    Note that nt, xt, wid are subtypes of any-nz.

    Another side of any-nz is that a union type ( any-nz | 0 ) is a natively discriminated union. This has led to a common approach of returning
    any-nz on success and 0 on failure.
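
    A minimal sketch of this pattern, using only the standard
    SEARCH-WORDLIST ( c-addr u wid -- 0 | xt 1 | xt -1 ); the name
    IN-FORTH? is made up for the illustration and is not part of any
    proposal:

    ```forth
    \ Illustration only; IN-FORTH? is a hypothetical name.
    \ On success the top cell is non-zero and the xt lies below it;
    \ on failure a single 0 is returned, so IF discriminates the union.
    : in-forth? ( c-addr u -- )
      forth-wordlist search-wordlist   \ ( 0 | xt 1 | xt -1 )
      if   ( xt ) drop ." found"       \ non-zero: IF consumed the top cell
      else ( )         ." not found"   \ zero: nothing else was returned
      then cr ;

    s" DUP"            in-forth?
    s" NO-SUCH-WORD-X" in-forth?
    ```

    No comparison against a named failure value is needed here; the IF
    alone separates the two cases.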

    The question is: should the recognizers follow this approach? I think
    so. This effectively means that a type of a success result of a
    recognizer is a subtype of any-nz, and a type of a failure result is a
    subtype of the unit type "0".


    The only counterargument is that 0 on failure is too restrictive for implementations.

    This does not seem convincing, because `search-wordlist`, `find`,
    `find-name`, `find-name-in` return 0 on failure, and this is not too
    restrictive for implementations.

    OTOH, why should we, in this case, prefer the convenience of
    implementations over the convenience of programs?






    You can fake it by defining 0 CONSTANT TRANSLATE-NONE, but then you
    never know if your code ports to other systems where TRANSLATE-NONE is non-zero. For now Gforth does it this way, but I don't expect that to
    be the final stage.

    Should we prefer to separate types or merge them?


    In other words, should we restrict implementation options in this
    regard? Yes, because this is a common approach, which makes programs
    simpler.



    Both approaches have advantages:

    * With separate words for dealing with the types, we can easily find
    all uses of that type and do something about it. E.g., a while ago
    I changed the cs-item (control-flow stack item) in Gforth from three
    to four cells. This was relatively easy because there are only a
    few words in Gforth that deal with cs items.


    The cs-item example does not demonstrate any advantages because the
    formal type didn't change. You only needed to find the places where the system-specific subtype was used by system-specific methods. Places
    where the formal type was used didn't change.




    * With a merged approach, we can use the same words for dealing with
    several types, with further words building upon these words (instead
    of having to define the further words n times for n types). But
    that makes the separation problem even harder.

    A separation (i.e., breaking a subtyping relationship) should not be
    planned at all.


    Overall, I think that the merged approach is preferable, but only if
    you are sure that you will never need to separate the types (whether
    due to a committee decision or because some new requirement means that
    you have to change the representation of the type).


    If an old data type will not fit the new requirements in the future, the
    new type (and new methods) should be introduced. Changing existing
    subtypes of an old type cannot be planned in principle.




    --
    Ruvim

  • From dxf@dxforth@gmail.com to comp.lang.forth on Fri Sep 26 10:56:07 2025
    From Newsgroup: comp.lang.forth

    On 25/09/2025 4:36 pm, Anton Ertl wrote:
    dxf <dxforth@gmail.com> writes:
    "ambiguous condition:
    A circumstance for which this Standard does not prescribe a specific behavior
    for Forth systems and programs."

    "undefined" and "not prescribe a specific behavior" seem much alike to me.
    Either way, the Standard is saying don't do this thing.

    Not really. The standard just does not specify what happens in this
    case.

    It's not as if
    they'd said nothing about it and left it up to you.

    It's exactly that.

    Hardly:

    -14 [THROW] interpreting a compile-only word

    As a programmer, if you know how the systems you are interested in
    behave, you can make use of that knowledge; the program will then not
    conform to the current standard, but still work as intended on these
    systems. In a standard based on common practice, that's the only
    way to achieve progress.

    AFAICS the standard is a document of agreed practice - not 'you do your
    thing and I'll do mine'. But perhaps you were never up for that? I
    can't say I was - not that I knew it at the time. When one grows up in
    an environment that peddles a certain idea, it can be some time before
    one realizes that's all it is.

  • From Hans Bezemer@the.beez.speaks@gmail.com to comp.lang.forth on Fri Sep 26 17:09:55 2025
    From Newsgroup: comp.lang.forth

    On 25-09-2025 07:22, dxf wrote:

    4.1.1 Implementation-defined options
    The implementation-defined items in the following list represent characteristics and choices left to the discretion of the implementor, provided that the requirements of this Standard are met.

    (HB) This is an extensional definition - it lists all elements in the
    set of "implementation defined options".

    4.1.2 Ambiguous conditions
    A system shall document the system action taken upon each of the general
    or specific ambiguous conditions identified in this Standard.

    (HB) This is an extensional definition - it lists all elements in the
    set of "ambiguous conditions".

    The fun part is that in ANS-Forth "undefined" is actually undefined.
    Using Merriam-Webster: *not provided with a definition*

    "not prescribe a specific behavior" equates to "ambiguous condition" (2.1):
    a circumstance for which this Standard does not prescribe a specific
    behavior for Forth systems and programs. It's not related to "undefined".

    If you invoke an "ambiguous condition" as a programmer, you cannot
    depend on any standard behavior, because there is none.

    If you have to tackle an "ambiguous condition" as an implementer, the
    standard describes in 3.4.4 (Possible actions on an ambiguous condition)
    which action you can take. This is a limited list - so it's not
    "anything you wanna do".

    E.g. "no loop parameters" is an ambiguous condition for Forth's +LOOP -
    and has to be tackled according to section 3.4.4. However +LOOP in interpretation mode is undefined. Since "undefined" is undefined, the
    possible actions that can be taken are also undefined. If you want to interpret that as "anything goes" I won't blame you ;-)

    Show some nice ASCII art. Just a suggestion.

    Hans Bezemer



    On 24/09/2025 4:45 pm, Anton Ertl wrote:
    albert@spenarnc.xs4all.nl writes:
    This shows me how to Lift this defect. Rename LITERAL to (LIT) and
    define
    : LITERAL 'LIT , , ; IMMEDIATE

    Looks good.

    In the standard:
    LITERAL :
    Interpretation: Interpretation syntax for this word is undefined.

    Has ISO changed the text? Forth-94 and Forth-2012 say:

    |Interpretation:
    |Interpretation semantics for this word are undefined.

    What if the standard says
    execution of this word while in interpret mode is an ambiguous condition

    It does not, and that's a good thing.

    "ambiguous condition:
    A circumstance for which this Standard does not prescribe a specific behavior for Forth systems and programs."

    "undefined" and "not prescribe a specific behavior" seem much alike to me. Either way, the Standard is saying don't do this thing. It's not as if they'd said nothing about it and left it up to you.


  • From Ruvim@ruvim.pinka@gmail.com to comp.lang.forth on Sat Sep 27 13:43:54 2025
    From Newsgroup: comp.lang.forth

    On 2025-09-26 02:19, Ruvim wrote:
    On 2025-09-17 20:53, Anton Ertl wrote:
    This posting is a more general reflection about designing types in
    Forth; it just uses recognizers as example.

    The original proposal for recognizers had R:FAIL as the result of a
    recognizer that did not recognize the input. Later that was renamed
    to NOTFOUND; then there was a proposal where 0 would be used instead,
    and Bernd Paysan changed all the uses of NOTFOUND in Gforth to 0.
    Finally, on last Thursday the committee decided to go with
    TRANSLATE-NONE for that result.

    [...]


    There is a quite general type: ( i*x x\0 ). Let's call it "any-nz".

    The unique feature of this type is that there is a simple and general
    method to check whether a data object is a member of this type: just
    check whether the top single-cell value is a non-zero. And this method
    applies to *any* subtype of this type. This method is made even more
    elegant by the fact that control flow operators apply it automatically.

    Note that nt, xt, wid are subtypes of any-nz.

    Another side of any-nz is that a union type ( any-nz | 0 ) is a natively
    discriminated union. This has led to a common approach of returning
    any-nz on success and 0 on failure.

    The question is: should the recognizers follow this approach? I think
    so. This effectively means that a type of a success result of a
    recognizer is a subtype of any-nz, and a type of a failure result is a subtype of the unit type "0".


    This also makes recognizers more convenient to use in programs.

    For example:

    "foo" rec-name if ( nt ) ... else ( ) ... then

    buf accepted ( sd.string ) trim
    rec-number-float if ( F: r ) ...
    else ." try again" cr recurse then

    Why burden users with the word `translate-none` in all such cases
    (instead of burdening system implementors with special-casing zero in
    only a few cases)?
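
    To make the difference concrete, here is a rough side-by-side sketch.
    The words REC-NAME and TRANSLATE-NONE are taken from the proposals
    discussed in this thread; the stack effects written in the comments
    are assumptions, and .NAME-A / .NAME-B are made-up names. The two
    definitions target different conventions, so both behave as intended
    on the same system only if TRANSLATE-NONE happens to be 0 there.

    ```forth
    \ Sketch under stated assumptions, not a reference implementation.

    \ Convention A (assumed): rec-name ( c-addr u -- nt translation | 0 )
    : .name-a ( c-addr u -- )
      rec-name                          \ IF tests the result directly
      if   ( nt ) name>string type
      else ( )    ." not a name"
      then cr ;

    \ Convention B (assumed):
    \   rec-name ( c-addr u -- nt translation | translate-none )
    : .name-b ( c-addr u -- )
      rec-name translate-none <>        \ explicit comparison needed first
      if   ( nt ) name>string type
      else ( )    ." not a name"
      then cr ;
    ```

    The extra `translate-none <>` is small in one place, but it recurs at
    every point where a program checks a recognizer result.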



    Worse, in the latest proposal this word `translate-none` doesn't do what
    its name implies: it does not perform any translation.

    In the previous proposal (2024-12-15) translate-* words perform
    translation. In the latest proposal (2025-09-12) translate-* words are constants (i.e. words that return some value, the same each time they
    are executed).

    According to the traditional naming convention, the names of constants
    should be nouns (or noun phrases), not verbs.

    This is as if `read-file` returned a reading mode (a constant) instead
    of reading data from a file.




    The only counterargument is that 0 on failure is too restrictive for implementations.

    This does not seem convincing, because `search-wordlist`, `find`,
    `find-name`, `find-name-in` return 0 on failure, and this is not too
    restrictive for implementations.

    OTOH, why should we, in this case, prefer the convenience of
    implementations over the convenience of programs?



    --
    Ruvim

  • From dxf@dxforth@gmail.com to comp.lang.forth on Mon Sep 29 00:42:10 2025
    From Newsgroup: comp.lang.forth

    On 27/09/2025 1:09 am, Hans Bezemer wrote:
    ...
    Show some nice ASCII art. Just a suggestion.

    Because Standard Forth is crazy-making? I just finished an Intel-HEX
    file reformatter to work-around someone else's poor decision. That at
    least was do-able.

  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Mon Sep 29 05:54:02 2025
    From Newsgroup: comp.lang.forth

    albert@spenarnc.xs4all.nl writes:
    In article <2025Sep25.083640@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    As a programmer, if you know how the systems you are interested in
    behave, you can make use of that knowledge; the program will then not
    conform to the current standard, but still work as intended on these
    systems. In a standard based on common practice, that's the only
    way to achieve progress.

    That is for unsafe languages like Forth or Fortran. Algol / Pascal
    programs behave the same on all systems, except for restrictions
    due to the program environment like "memory exhausted", "too many
    nesting levels", "floating point overflow", for the language model
    is based on infinite resources.
    The language definition is nailed down from day one and there are no
    ambiguous holes to be filled.

    Nice fantasy.

    For Algol 60 (not sure about Algol 68), they could not even agree on a machine-readable representation of the programs. I.e., you cannot
    write a file containing any Algol 60 program that is guaranteed to be
    compiled by all Algol 60 compilers; the behaviour of the program is
    only when you have compiled it and can run it.

    In Pascal, the program can access a pointer after DELETEing its
    contents, and you can also DELETE the pointer several times, all not
    defined by the language and typically resulting in programs behaving
    other than intended. The same kinds of execution sequences are often
    mentioned as vulnerabilities in C programs (use after free, double
    free); this only is not reported widely for Pascal programs because
    there are no Pascal programs in wide use.

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2025 CFP: http://www.euroforth.org/ef25/cfp.html
    EuroForth 2025 registration: https://euro.theforth.net/
  • From albert@albert@spenarnc.xs4all.nl to comp.lang.forth on Mon Sep 29 11:03:25 2025
    From Newsgroup: comp.lang.forth

    In article <2025Sep29.075402@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    albert@spenarnc.xs4all.nl writes:
    In article <2025Sep25.083640@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    As a programmer, if you know how the systems you are interested in
    behave, you can make use of that knowledge; the program will then not
    conform to the current standard, but still work as intended on these
    systems. In a standard based on common practice, that's the only
    way to achieve progress.

    That is for unsafe languages like Forth or Fortran. Algol / Pascal
    programs behave the same on all systems, except for restrictions
    due to the program environment like "memory exhausted", "too many
    nesting levels", "floating point overflow", for the language model
    is based on infinite resources.
    The language definition is nailed down from day one and there are no
    ambiguous holes to be filled.

    Nice fantasy.

    For Algol 60 (not sure about Algol 68), they could not even agree on a
    machine-readable representation of the programs. I.e., you cannot
    write a file containing any Algol 60 program that is guaranteed to be
    compiled by all Algol 60 compilers; the behaviour of the program is
    only when you have compiled it and can run it.

    The character set wherein the program is represented is irrelevant.
    I cannot compile an EBCDIC FORTRAN program on my Linux system.

    For Algol 68 there was a small hole in the original report
    specification that led to the "revised report".


    In Pascal, the program can access a pointer after DELETEing its
    contents, and you can also DELETE the pointer several times, all not
    defined by the language and typically resulting in programs behaving
    other than intended. The same kinds of execution sequences are often
    mentioned as vulnerabilities in C programs (use after free, double
    free); this only is not reported widely for Pascal programs because
    there are no Pascal programs in wide use.

    That was an oversight, not intended.


    - anton
    --
    The Chinese government is satisfied with its military superiority over USA.
    The next 5 year plan has as primary goal to advance life expectancy
    over 80 years, like Western Europe.
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.forth on Mon Sep 29 16:27:53 2025
    From Newsgroup: comp.lang.forth

    albert@spenarnc.xs4all.nl writes:
    In article <2025Sep29.075402@mips.complang.tuwien.ac.at>,
    For Algol 60 (not sure about Algol 68), they could not even agree on a
    machine-readable representation of the programs. I.e., you cannot
    write a file containing any Algol 60 program that is guaranteed to be
    compiled by all Algol 60 compilers; the behaviour of the program is
    only when you have compiled it and can run it.

    The character set wherein the program is represented is irrelevant.
    I cannot compile a EBCDIC FORTRAN program in my linux system.

    EBCDIC did not exist when Fortran was designed and released, so your
    example demonstrates that different encodings are not the problem. If
    you get a Fortran program encoded in EBCDIC, you can convert it to
    ASCII with a command like "recode ebcdic..ascii". The important thing
    is that you can then compile this program with a Fortran compiler on
    Linux.

    By contrast, even if there was an Algol 60 compiler on Linux, and if
    the encoding used by that compiler is the same as that of the source
    programs you have available (do you have any?), the Algol 60
    specification would not guarantee that you can compile it with that
    compiler, because the specification does not specify the
    machine-readable representation of programs at all.

    In Pascal, the program can access a pointer after DELETEing its
    contents, and you can also DELETE the pointer several times, all not
    defined by the language and typically resulting in programs behaving
    other than intended. The same kinds of execution sequences are often
    mentioned as vulnerabilities in C programs (use after free, double
    free); this only is not reported widely for Pascal programs because
    there are no Pascal programs in wide use.

    That was an oversight, not intended.

    Whether intended, oversight, or something else, this property of
    Pascal is a counterexample for your claim.

    Prolog does not have this problem of Pascal, and yet its standard
    does not specify everything; it even has undefined behaviour.

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2025 CFP: http://www.euroforth.org/ef25/cfp.html
    EuroForth 2025 registration: https://euro.theforth.net/
  • From Hans Bezemer@the.beez.speaks@gmail.com to comp.lang.forth on Tue Sep 30 18:15:42 2025
    From Newsgroup: comp.lang.forth

    On 28-09-2025 16:42, dxf wrote:
    On 27/09/2025 1:09 am, Hans Bezemer wrote:
    ...
    Show some nice ASCII art. Just a suggestion.

    Because Standard Forth is crazy-making? I just finished an Intel-HEX
    file reformatter to work-around someone else's poor decision. That at
    least was do-able.


    If you want to annoy some ANS standard authors, just because you can, do
    it with a bit of pizzazz ;-)

    HB
  • From dxf@dxforth@gmail.com to comp.lang.forth on Wed Oct 1 14:10:38 2025
    From Newsgroup: comp.lang.forth

    On 1/10/2025 2:15 am, Hans Bezemer wrote:
    On 28-09-2025 16:42, dxf wrote:
    On 27/09/2025 1:09 am, Hans Bezemer wrote:
    ...
    Show some nice ASCII art. Just a suggestion.

    Because Standard Forth is crazy-making? I just finished an Intel-HEX
    file reformatter to work-around someone else's poor decision. That at
    least was do-able.


    If you want to annoy some ANS standard authors, just because you can, do it with a bit of pizzazz ;-)

    HB

    Just tell them that they were never really ANS followers. Works every
    time because it's so obvious :-)

  • From Hans Bezemer@the.beez.speaks@gmail.com to comp.lang.forth on Thu Oct 2 15:48:03 2025
    From Newsgroup: comp.lang.forth

    On 01-10-2025 06:10, dxf wrote:
    On 1/10/2025 2:15 am, Hans Bezemer wrote:
    On 28-09-2025 16:42, dxf wrote:
    On 27/09/2025 1:09 am, Hans Bezemer wrote:
    ...
    Show some nice ASCII art. Just a suggestion.

    Because Standard Forth is crazy-making? I just finished an Intel-HEX
    file reformatter to work-around someone else's poor decision. That at
    least was do-able.


    If you want to annoy some ANS standard authors, just because you can, do it with a bit of pizzazz ;-)

    HB

    Just tell them that they were never really ANS followers. Works every
    time because it's so obvious :-)


    Never thought of that - but it's so true. You're such a wise man. ;-)

    Hans Bezemer
