• Compilers :)

    From Tristan B. Velloza Kildaire@deavmi@redxen.eu to comp.compilers on Mon Jan 2 12:28:12 2023
    From Newsgroup: comp.compilers

    I am currently working on my own compiler for something like C but with
    minimal object orientation support and no features like
    templating/generics etc etc.

    Trying to get a feeler out there for anyone who would be interested in
    using such a language, obviously the project is something I work on in
    my spare time but I have written everything from scratch. I plan to, by
    the end of 2023 hopefully, have a full release out. The code emit is
    already working well and so is the dependency tree algorithm.

    Also, keen to hear what everyone else is working on here?

    [Not to discourage you or anything, but the chances of people
    switching to yet another C-like language round to zero. Only Rust and
    Go have gotten any traction lately, but they both have big companies
    behind them, and they have libraries and build environments. Writing
    your own language is fun and a good way to learn, but it's not likely
    to be of interest to other people. -John]

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Spiros Bousbouras@spibou@gmail.com to comp.compilers on Mon Jan 2 20:52:28 2023
    From Newsgroup: comp.compilers

    On Mon, 2 Jan 2023 12:28:12 +0200
    "Tristan B. Velloza Kildaire" <deavmi@redxen.eu> wrote:
    > I am currently working on my own compiler for something like C but with
    > minimal object orientation support and no features like
    > templating/generics etc etc.
    >
    > Trying to get a feeler out there for anyone who would be interested in
    > using such a language, obviously the project is something I work on in
    > my spare time but I have written everything from scratch.

    Knowing what you are trying to achieve, i.e. why you are creating a new
    programming language, would be useful. On which operating systems is it
    going to work? What will be the license?

    > I plan to, by
    > the end of 2023 hopefully, have a full release out. The code emit is
    > already working well and so is the dependency tree algorithm.

    Code emitter for what targets ?
  • From Steve Limb@stephenjohnlimb@gmail.com to comp.compilers on Tue Jan 3 16:24:05 2023
    From Newsgroup: comp.compilers

    I'm not sure there would be that much demand for a cut down C.

    I too am working on a language - https://www.ek9.io/ . I'm not sure that
    will get any traction either!

    But it is technically challenging and interesting grappling with the
    compromises and technologies involved.

    My main drive has been to experiment with syntax in different forms and
    see how they feel and also roll in loads (probably an excessive amount)
    of much higher level concepts.

  • From gah4@gah4@u.washington.edu to comp.compilers on Tue Jan 3 12:52:06 2023
    From Newsgroup: comp.compilers

    On Tuesday, January 3, 2023 at 9:45:17 AM UTC-8, Steve Limb wrote:
    > I'm not sure there would be that much demand for a cut down C.

    I was reading it as a cut down C++, or maybe a cut-up C.

    If you think that C++ has too many features, then I can see a reduced
    version. Personally, I think that is why I like Java over C++.

    Well, for one, it is more C-like than C++.

    Otherwise, I believe the source for the original C++ compiler, which
    converted to C for a C compiler, is still around, and could be used as
    a start for a new C++ like language.

  • From arnold@arnold@skeeve.com (Aharon Robbins) to comp.compilers on Wed Jan 4 17:12:17 2023
    From Newsgroup: comp.compilers

    In article <23-01-004@comp.compilers>, gah4 <gah4@u.washington.edu> wrote:
    > Otherwise, I believe the source for the original C++ compiler, which
    > converted to C for a C compiler, is still around, and could be used as
    > a start for a new C++ like language.

    It is true that it's around, but I think it has copyright / license
    limitations that would prevent building something new on top of it.

    Arnold
    --
    Aharon (Arnold) Robbins arnold AT skeeve DOT com

    [The copy at the computer history museum says "The source code in this
    section is posted with the permission of the copyright owner for
    historical research purposes only." It's from 1997 so I would think
    it's a long way from modern C++. -John]
  • From gah4@gah4@u.washington.edu to comp.compilers on Wed Jan 4 12:39:10 2023
    From Newsgroup: comp.compilers

    On Wednesday, January 4, 2023 at 10:26:58 AM UTC-8, Aharon Robbins wrote:

    (snip, where I wrote about the original C++ compiler.)

    > It is true that it's around, but I think it has copyright / license
    > limitations that would prevent building something new on top of it.
    >
    > [The copy at the computer history museum says "The source code in this
    > section is posted with the permission of the copyright owner for
    > historical research purposes only." It's from 1997 so I would think
    > it's a long way from modern C++. -John]

    It sounds like the OP is, so far, in a research project.

    But the OP also seems to want something like C++, but without all the
    fancy new features. Starting with a compiler without those features
    seemed like a good way to go.

    (I believe some have suggested, over the years, a C--, though
    exactly which features are removed, I don't know.)

    The process for starting with copyright code, and modifying it
    until all copyright parts are gone, seems to be well known, though
    I suspect never easy.
    [I wouldn't try mutating out the copyrighted stuff. Depending on how aggressive the copyright holder is, they can claim copyright in the
    structure and sequence of the code. And if it's a research project,
    it's not that hard to make a parser and symbol table using compiler
    tools better than 1997 era yacc. -John]
  • From Hans-Peter Diettrich@DrDiettrich1@netscape.net to comp.compilers on Thu Jan 5 01:12:46 2023
    From Newsgroup: comp.compilers

    On 1/2/23 11:28 AM, Tristan B. Velloza Kildaire wrote:

    > I am currently working on my own compiler for something like C

    Define "like C".

    DoDi
    [Grouped with {} ? Comments with /* */ ? Designed by a guy who
    worked for the phone company? -John]
  • From marb...@yahoo.co.uk@marblypup@yahoo.co.uk to comp.compilers on Thu Jan 5 06:27:06 2023
    From Newsgroup: comp.compilers

    On Tuesday, 3 January 2023 at 17:45:17 UTC, Steve Limb wrote:
    > I'm not sure there would be that much demand for a cut down C.

    I recently read (well, skimmed) http://www.mjbauer.biz/C-less%20Reference%20Manual.pdf
    "A concise subset of the C programming language".
    Though I'm a bit baffled by some of Bauer's choices. Why is
    `char *foo="foo", *bar="bar"; puts(foo); puts(bar);`
    allowed but not
    `char *foo="foo"; puts(foo); char *bar="bar"; puts(bar);`
    ? Admittedly, the latter is only allowed in relatively recent C, but from my (very limited) experience writing compilers, the latter is no harder to compile.
    I idly thought about adding stuff to C-less and calling it C-more-or-less, Cmol, for short.

    I'm up for reading the source of any relatively simple compiler for, and written in, anything C-like. I've tried making sense of the GNU C compiler a few times. My brain may recover one day!
    [If you're doing a one-pass compiler, it's easier if all the declarations
    are at the beginning so you can generate the code to set up the stack
    frame and do initializations. I agree that on modern computers it's not a
    big deal, but remember that early C compilers ran in 24K bytes and I
    don't mean megabytes. -John]
  • From gah4@gah4@u.washington.edu to comp.compilers on Thu Jan 5 16:26:47 2023
    From Newsgroup: comp.compilers

    On Thursday, January 5, 2023 at 6:34:35 AM UTC-8, marb...@yahoo.co.uk wrote:

    (snip)
    > I'm up for reading the source of any relatively simple compiler for,
    > and written in, anything C-like. I've tried making sense of the GNU C
    > compiler a few times. My brain may recover one day!

    Some years ago, I bought the LCC book.

    The book explains it in some detail, in addition to any comments
    in the code. As far as I know, it is meant for understanding.

    It is especially convenient if you only want to retarget the code
    generator, for a new (or existing) machine.

    I have no desire to look at gcc.

  • From David Brown@david.brown@hesbynett.no to comp.compilers on Fri Jan 6 15:39:05 2023
    From Newsgroup: comp.compilers

    On 05/01/2023 15:27, marb...@yahoo.co.uk wrote:
    > On Tuesday, 3 January 2023 at 17:45:17 UTC, Steve Limb wrote:
    >> I'm not sure there would be that much demand for a cut down C.
    >
    > I recently read (well, skimmed)
    > http://www.mjbauer.biz/C-less%20Reference%20Manual.pdf
    > "A concise subset of the C programming language".
    > Though I'm a bit baffled by some of Bauer's choices. Why is
    > `char *foo="foo", *bar="bar"; puts(foo); puts(bar);`
    > allowed but not
    > `char *foo="foo"; puts(foo); char *bar="bar"; puts(bar);`
    > ? Admittedly, the latter is only allowed in relatively recent C, but
    > from my (very limited) experience writing compilers, the latter is no
    > harder to compile.

    By "relatively recent C", you mean C99 ? Mixing statements and
    declarations was standardised in C nearly 25 years ago, and was probably implemented at least a decade before that in some compilers. (Some C
    compilers incorporated features from C++ that later became standardised
    in C99.)

    > I idly thought about adding stuff to C-less and calling it
    > C-more-or-less, Cmol, for short.

    There have been many "sort-of-C" languages made, as well as
    "sort-of-C++". (Embedded C++ was one attempt at making a simpler subset
    of C++ for use in embedded systems - despite significant backing, it has
    failed completely.)

    One subset of C that was useful is C--. This was aimed at being a code generation target for higher level languages, to improve portability and
    save the high level compiler writers from learning the details of
    assembly language on different targets. I think these days it is more
    common to target LLVM or a common virtual machine (such as JVM) for such purposes.


    The "C-less" language referenced in the link above seems to be targeting
    "a first course in embedded programming". As an embedded programmer by profession, I would not want to hire a new developer that had learned
    "C-less". They would /think/ that they could program in C, but be
    misled in several areas and have learned a number of bad habits. (I
    don't want to go through them all, but I agree with you that the style
    of "all your declarations at the start of the function" is long
    outdated, and often - but not universally - considered a bad idea.)

    I also find it strange that this is supposedly a description of parts of
    the C language, but regularly uses different terms from the C standards
    and other C documentation for no visible benefit. This could only serve
    to confuse the student.


    > I'm up for reading the source of any relatively simple compiler for,
    > and written in, anything C-like. I've tried making sense of the GNU C
    > compiler a few times. My brain may recover one day!

    gcc is the result of thousands of man-hours, spread over more than three
    decades and thousands of contributors. No, it is not an easy read! I think
    "LCC" is probably the best choice of a simple C compiler to read and
    understand - this was part of the motivation for writing it. "Tiny C
    Compiler" is another option.


    > [If you're doing a one-pass compiler, it's easier if all the
    > declarations are at the beginning so you can generate the code to set
    > up the stack frame and do initializations. I agree that on modern
    > computers it's not a big deal, but remember that early C compilers ran
    > in 24K bytes and I don't mean megabytes. -John]

    If you are writing a compiler for use by people writing code, then being
    able to mix declarations and statements is hugely important - and since generally people /use/ compilers more than they write them, the
    trade-off should be on the side of the effort for the users, rather than
    the compiler writers.

    But if you are writing it for fun, or to learn about compiler writing,
    then of course a simpler language is easier. And if your aim is for an intermediary language generated by a higher level compiler, then a
    simple subset is also convenient.
  • From David Brown@david.brown@hesbynett.no to comp.compilers on Sun Jan 8 20:21:10 2023
    From Newsgroup: comp.compilers

    On 07/01/2023 11:14, marb...@yahoo.co.uk wrote:
    >> [If you're doing a one-pass compiler, it's easier if all the
    >> declarations are at the beginning so you can generate the code to set
    >> up the stack frame and do initializations. I agree that on modern
    >> computers it's not a big deal, but remember that early C compilers ran
    >> in 24K bytes and I don't mean megabytes. -John]
    >
    > Presumably such a compiler would have to create 2 stack frames for
    > `char *foo="foo"; puts(foo); { char *bar="bar"; puts(bar); }`

    No. The compiler could treat this as though you had written :

        char *foo = "foo";
        char *bar_nested_scope_1;

        puts(foo);
        {
            bar_nested_scope_1 = "bar";
            puts(bar_nested_scope_1);
        }

    In other words, it can combine all the variables declared in nested
    scope and act as though they were all defined at the start of the
    function. It would have to take care of naming, as nested block scopes
    could have identifiers that shadow outer scope names. And it may or may
    not choose to allow overlapping of variables that have independent
    lifetimes.
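
    A minimal sketch of that hoisting, including the renaming needed when an
    inner name shadows an outer one. The function names and the generated
    name x_nested_scope_1 are made up for illustration; a real compiler
    would do this on its symbol table rather than on source text.

```c
/* As written: the inner x shadows the outer x inside the block. */
int nested_form(void)
{
    int x = 1;
    int sum = x;
    {
        int x = 10;
        sum += x;
    }
    return sum + x;   /* the outer x is visible again here */
}

/* As a compiler could treat it: every local declared at the top,
   with the inner x renamed so the two no longer collide. */
int hoisted_form(void)
{
    int x;
    int sum;
    int x_nested_scope_1;   /* hypothetical generated name */

    x = 1;
    sum = x;
    {
        x_nested_scope_1 = 10;
        sum += x_nested_scope_1;
    }
    return sum + x;
}
```

    Both functions return the same value, which is the property the rewrite
    has to preserve.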

    It only gets complicated when you have variable length arrays, which
    would allow the declaration of an array whose size is not known at the
    start of the function. But if you are supporting VLA's, you'll already
    have support for more sophisticated mixes of variables and code.

    > [In a mutant version of C with nested scopes, I suppose so, but when C
    > compilers ran in 24K bytes, it didn't. -John]

    I don't have my copy of K&R handy, or a pre-K&R Unix C manual, but I
    expect someone will correct me if I'm wrong :-) As far as I know, the C
    described in "The C Programming Language" in 1978, when 24 KB was still
    a big deal, supported declarations at the start of any compound
    statement block. That is, nested scopes. It's possible that pre-K&R C
    compilers were more limited.
    [I actually used that 24K C compiler in about 1975 and I am reasonably
    sure it did not let you put declarations other than in the outer block.
    There's a 1978 edition of K&R at archive.org and by then it did let you
    put declarations in any block. It's a little harder than what you say
    because declarations in non-overlapping blocks should overlay each
    other, e.g.:

        foo() {
            int a;
            ...
            {
                int b[100];
                somefunc(b);
            }
            {
                float c[100];
                otherfunc(c);
            }
        }

    you want b and c to use the same storage. It's not hard, but it's a
    little more than promoting and renaming. -John]
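
    One way to picture the overlay is as a frame layout in which the two
    arrays share a union. This is only a sketch: the struct and function
    names are invented here, and a compiler would assign stack offsets
    directly rather than build a C struct.

```c
#include <stddef.h>

/* Frame for foo() if b and c each get their own slot. */
struct foo_frame_separate {
    int   a;
    int   b[100];
    float c[100];
};

/* Frame for foo() if the two non-overlapping blocks share storage. */
struct foo_frame_overlaid {
    int a;
    union {
        int   b[100];   /* live only inside the first inner block  */
        float c[100];   /* live only inside the second inner block */
    } block;
};

size_t separate_bytes(void) { return sizeof(struct foo_frame_separate); }
size_t overlaid_bytes(void) { return sizeof(struct foo_frame_overlaid); }
```

    On a typical platform with 4-byte int and float, the separate layout
    needs about twice the space of the overlaid one.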
  • From Hans-Peter Diettrich@DrDiettrich1@netscape.net to comp.compilers on Mon Jan 9 04:48:37 2023
    From Newsgroup: comp.compilers

    On 1/8/23 8:21 PM, David Brown wrote:

    > In other words, it can combine all the variables declared in nested
    > scope and act as though they were all defined at the start of the
    > function.

    AFAIR nested scopes were introduced just to allow for space-saving
    memory overlays, regardless of whether a compiler really takes that
    optimization *option*.

    Of course problems can arise from malware assuming memory contents as
    left over from a previous block, as it's not required that the compiler initializes all local variables on block entry.

    DoDi
  • From Kaz Kylheku@864-117-4973@kylheku.com to comp.compilers on Mon Jan 9 17:41:51 2023
    From Newsgroup: comp.compilers

    On 2023-01-06, David Brown <david.brown@hesbynett.no> wrote:
    > don't want to go through them all, but I agree with you that the style
    > of "all your declarations at the start of the function" is long
    > outdated, and often - but not universally - considered a bad idea.)

    Declarations have never been required to be at the top of a function in
    C, because they can be in any compound statement block. I think
    that goes all the way back to the B language. [Nope, see the next message. -John]

    The "Variables at the top" meme may be something coming from Pascal.
    IIRC, in Pascal, compound statements aren't full blocks; they cannot
    have VAR declarations.

    When programmers abandoned Pascal in the 1980s, they carried over this
    habit into C.

    I hate mixed declarations and code because it's almost as bad as
    variables-at-the-top. The scope of a declaration that is just planted
    into the middle of a compound statement block extends all the way to the
    end of the block. There should be a smaller enclosing block which
    exactly delimits the scope of that variable. If some variable is used
    over seven lines of a 300 line function, those seven lines should
    ideally be enclosed in curly braces, so the variable is not known
    outside of those lines. Just planting an unwrapped declaration of the
    variable at the function scope level (outermost block) solves only half
    the problem. The scope of the variable starts close to where the
    variable is used, which is good; but it still goes to the end of the
    function, way past its actual semantic scope that ends at the last use.

    A block like this can be repeated with copy and paste:

        {
            int yes = 1;
            setsockopt(fd, SO_WHATEVER, &yes);
        }

    This cannot: you will get redefinition errors:

        int yes = 1;
        setsockopt(fd, SO_WHATEVER, &yes);

    you have to think about ensuring that "int yes" occurs in one place
    that is before the first use, and the other places assign to it.
    Or invent different names.
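
    A compilable version of the first form, as a sketch. set_flag() is a
    hypothetical stand-in for the setsockopt() call above, so the example
    needs no sockets.

```c
/* Hypothetical stand-in for a call that reads an option through a pointer. */
static int set_flag(const int *value)
{
    return *value != 0;
}

int configure(void)
{
    int enabled = 0;

    {
        int yes = 1;
        enabled += set_flag(&yes);
    }
    {
        int yes = 1;   /* same name, new block: no redefinition error */
        enabled += set_flag(&yes);
    }
    return enabled;
}
```

    Removing both pairs of braces puts the two declarations of yes in the
    same scope, which a C compiler rejects as a redefinition.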

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.compilers on Mon Jan 9 11:24:07 2023
    From Newsgroup: comp.compilers

    David Brown <david.brown@hesbynett.no> writes:
    > [...]
    >> [In a mutant version of C with nested scopes, I suppose so, but when C
    >> compilers ran in 24K bytes, it didn't. -John]
    >
    > I don't have my copy of K&R handy, or a pre-K&R Unix C manual, but I
    > expect someone will correct me if I'm wrong :-) As far as I know, the C
    > described in "The C Programming Language" in 1978, when 24 KB was still
    > a big deal, supported declarations at the start of any compound
    > statement block. That is, nested scopes. It's possible that pre-K&R C
    > compilers were more limited.
    >
    > [I actually used that 24K C compiler in about 1975 and I am reasonably
    > sure it did not let you put declarations other than in the outer block.
    > There's a 1978 edition of K&R at archive.org and by then it did let you
    > put declarations in any block. It's a little harder than what you say
    > because declarations in non-overlapping blocks should overlay each
    > other, e.g.:
    >
    >     foo() {
    >         int a;
    >         ...
    >         {
    >             int b[100];
    >             somefunc(b);
    >         }
    >         {
    >             float c[100];
    >             otherfunc(c);
    >         }
    >     }
    >
    > you want b and c to use the same storage. It's not hard, but it's a
    > little more than promoting and renaming. -John]

    The 1975 C reference manual <https://www.bell-labs.com/usr/dmr/www/cman.pdf> allows declarations at the top of a function body but not at the top of
    a compound statement. K&R1 (1978) does allow declarations at the top of
    a compound statement.

    In the example above, you'd certainly *want* b and c to use the same
    storage, but the language doesn't require it. But it's a simple enough optimization that I'd expect all compilers to do it (or something more sophisticated). Also, a compiler could either generate code that
    allocates storage for each block on entry to the block, or that
    allocates the maximum size on entry to the function. I haven't bothered
    to look into what compilers actually do. (And it gets more complicated
    if you introduce variable-length arrays.)
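
    The second strategy, allocating the maximum size on entry to the
    function, amounts to a small computation over the tree of blocks. The
    sketch below is an assumption about how one might write it, not a
    description of any particular compiler; struct scope and both function
    names are invented, and alignment is ignored.

```c
#include <stddef.h>

/* Hypothetical scope-tree node: bytes of locals declared directly in
   this block, plus the blocks nested immediately inside it. */
struct scope {
    size_t own_bytes;
    const struct scope *children;
    size_t nchildren;
};

/* Frame size when sibling blocks overlay each other: a block needs its
   own locals plus the largest requirement among its nested blocks. */
size_t frame_bytes_overlaid(const struct scope *s)
{
    size_t largest = 0;
    for (size_t i = 0; i < s->nchildren; i++) {
        size_t n = frame_bytes_overlaid(&s->children[i]);
        if (n > largest)
            largest = n;
    }
    return s->own_bytes + largest;
}

/* Frame size when every local simply gets its own slot. */
size_t frame_bytes_hoisted(const struct scope *s)
{
    size_t total = s->own_bytes;
    for (size_t i = 0; i < s->nchildren; i++)
        total += frame_bytes_hoisted(&s->children[i]);
    return total;
}
```

    For the foo() example, with 4 bytes for a and 400 bytes for each array,
    the overlaid computation gives 404 bytes where plain hoisting gives 804.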

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    Working, but not speaking, for XCOM Labs
    void Void(void) { Void(); } /* The recursive call of the void */
    [On a PDP-11 or Vax, adjusting the size of local storage was a single instruction to adjust the stack pointer at the entry to or exit from
    each block. In the example above, imagine that between the two blocks
    is a call to bloatfunc() which has a large stack frame and you can see
    why it would have been worth it. I will admit that if you had goto
    statements, that made it considerably messier. -John]

  • From David Brown@david.brown@hesbynett.no to comp.compilers on Tue Jan 10 17:48:18 2023
    From Newsgroup: comp.compilers

    On 09/01/2023 18:41, Kaz Kylheku wrote:
    > On 2023-01-06, David Brown <david.brown@hesbynett.no> wrote:
    >> don't want to go through them all, but I agree with you that the style
    >> of "all your declarations at the start of the function" is long
    >> outdated, and often - but not universally - considered a bad idea.)
    >
    > Declarations have never been required to be at the top of a function in
    > C, because they can be in any compound statement block. I think
    > that goes all the way back to the B language.
    > [Nope, see the next message. -John]
    >
    > The "Variables at the top" meme may be something coming from Pascal.
    > IIRC, in Pascal, compound statements aren't full blocks; they cannot
    > have VAR declarations.

    I suspect it is much older than that - in assembly programming, you do
    not normally mix data and code sections.

    > When programmers abandoned Pascal in the 1980s, they carried over this
    > habit into C.
    >
    > I hate mixed declarations and code because it's almost as bad as
    > variables-at-the-top. The scope of a declaration that is just planted
    > into the middle of a compound statement block extends all the way to the
    > end of the block. There should be a smaller enclosing block which
    > exactly delimits the scope of that variable. If some variable is used
    > over seven lines of a 300 line function, those seven lines should
    > ideally be enclosed in curly braces, so the variable is not known
    > outside of those lines. Just planting an unwrapped declaration of the
    > variable at the function scope level (outermost block) solves only half
    > the problem. The scope of the variable starts close to where the
    > variable is used, which is good; but it still goes to the end of the
    > function, way past its actual semantic scope that ends at the last use.

    If your variable is only used in 7 lines out of a 300 line function,
    then perhaps your function is too long?

    I agree that small scopes for variables are good, and put declarations
    within compound statement blocks where practical (though I rarely make a
    new block simply to hold a variable). But that is not the sole purpose
    of mixing code and declarations - it is not even the major purpose, IMHO.

    The point is that you do not declare a variable until you actually have
    something to put in it. You never have this semi-alive object floating
    around where it is accessible, but has no valid or known state. You
    never have an artificial initialisation, such as putting 0 in a variable
    declared at the top of the function, in the mistaken belief that it
    makes code somehow "safer". You can make your variables "const" if they
    do not change. If you are using C++ (or other object-oriented
    language), you avoid the inefficiency of default-constructing an object
    and later assigning to it, instead of doing a single initialisation.

    Of course this applies just as well to variables defined inside blocks
    as to variables defined after code.
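
    The difference is easy to see side by side. This is a sketch with
    invented function names, not code from the thread; the second version
    assumes C99 or later.

```c
#include <stddef.h>

/* Declarations first: total and mean exist before they hold anything
   meaningful, and neither can be const. */
long mean_decls_first(const int *v, size_t n)
{
    long total;
    long mean;
    size_t i;

    total = 0;
    for (i = 0; i < n; i++)
        total += v[i];
    mean = n ? total / (long)n : 0;
    return mean;
}

/* Declarations at first use: each variable is created with its real
   value, and mean can be const because it never changes afterwards. */
long mean_decls_at_use(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];

    const long mean = n ? total / (long)n : 0;
    return mean;
}
```

    The two functions compute the same result; only the second rules out
    reading mean before it has been given a value.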

    > A block like this can be repeated with copy and paste:
    >
    >     {
    >         int yes = 1;
    >         setsockopt(fd, SO_WHATEVER, &yes);
    >     }
    >
    > This cannot: you will get redefinition errors:
    >
    >     int yes = 1;
    >     setsockopt(fd, SO_WHATEVER, &yes);
    >
    > you have to think about ensuring that "int yes" occurs in one place
    > that is before the first use, and the other places assign to it.
    > Or invent different names.


    This is something that I would prefer C and C++ to allow. I think it
    would improve the structure of some of my code, precisely as you describe.

    I'd also like to be able to write :

        int x = 10;
        x += 20;
        const int x = x; // Fix x - after this, x is constant

    I wonder if anyone has made a proposal for adding this feature to C or
    C++?

    [Variables at the top probably comes from Algol60 via Pascal. For
    assembler, depends on the assembler. Lots of them let you have several
    sections in the program and switch between the code and data sections as
    you go. IBM mainframe assemblers had this feature in the 1960s. -John]

  • From gah4@gah4@u.washington.edu to comp.compilers on Tue Jan 10 15:13:33 2023
    From Newsgroup: comp.compilers

    On Tuesday, January 10, 2023 at 2:16:32 PM UTC-8, David Brown wrote:

    (snip)
    > The point is that you do not declare a variable until you actually have
    > something to put in it. You never have this semi-alive object floating
    > around where it is accessible, but has no valid or known state. You
    > never have an artificial initialisation, such as putting 0 in a variable
    > declared at the top of the function, in the mistaken belief that it
    > makes code somehow "safer".

    Java requires that the compiler be able to figure out that a variable
    (well, scalar variable) is given a value before it is used. Most of the
    time, that works out fine. Once in a while, I know that it is given
    a value, but the compiler doesn't. In that case, it is initialized
    to (usually) 0, and a comment indicating why.

    (snip)

    > [Variables at the top probably comes from Algol60 via Pascal. For
    > assembler, depends on the assembler. Lots of them let you have several
    > sections in the program and switch between the code and data sections
    > as you go. IBM mainframe assemblers had this feature in the 1960s. -John]

    Most of the IBM mainframe assembly code I know, puts the variables
    at the bottom.

    Well, early on I started reading the generated code from compilers,
    and not so much later, the Fortran G&H library. Maybe not the best
    examples of structured code, though.

    Well, if by data section you mean DSECT, I suppose.
    Most I knew didn't do much of that, though.

    Variables at the bottom, and use the same base register for
    code and data. You could put the variables at the top, and
    branch around them. I don't remember anyone doing that.

    More recent processors, maybe from ESA/390 days, have
    data and instruction cache, and want data and instructions
    more than (if I remember) 256 bytes apart.

    But okay, it is completely different for reentrant programs,
    than the static allocation non-reentrant code of much of
    OS/360 associated programs.
    [For IBM assembler, I meant that you could have one CSECT for
    your code and another for your data, and you could switch
    between them as you go along. For the elite few of us who
    used TSS/360, a PSECT let you set the initial contents of
    the RW data that the RO code in a routine used. DSECT was
    something else, more like a structure definition. I realize
    that in practice most code was not reentrant and you put
    all your code and data in one section. -John]
  • From Thomas Koenig@tkoenig@netcologne.de to comp.compilers on Wed Jan 11 10:49:15 2023
    From Newsgroup: comp.compilers

    Kaz Kylheku <864-117-4973@kylheku.com> wrote:

    > The "Variables at the top" meme may be something coming from Pascal.
    > IIRC, in Pascal, compound statements aren't full blocks; they cannot
    > have VAR declarations.

    FORTRAN has had declaration statements (first version, DIMENSION
    only) at the top of procedures since the beginning. Algol 58
    aka IAL had declarations everywhere, while Algol 60 allowed them
    only at the beginning of blocks.

    > When programmers abandoned Pascal in the 1980s, they carried over this
    > habit into C.

    Probably, C just carried it over from the Algol tradition.
  • From Kaz Kylheku@864-117-4973@kylheku.com to comp.compilers on Wed Jan 11 11:02:56 2023
    From Newsgroup: comp.compilers

    On 2023-01-10, David Brown <david.brown@hesbynett.no> wrote:
    > On 09/01/2023 18:41, Kaz Kylheku wrote:
    >> On 2023-01-06, David Brown <david.brown@hesbynett.no> wrote:
    >> A block like this can be repeated with copy and paste:
    >>
    >>     {
    >>         int yes = 1;
    >>         setsockopt(fd, SO_WHATEVER, &yes);
    >>     }
    >>
    >> This cannot: you will get redefinition errors:
    >>
    >>     int yes = 1;
    >>     setsockopt(fd, SO_WHATEVER, &yes);
    >>
    >> you have to think about ensuring that "int yes" occurs in one place
    >> that is before the first use, and the other places assign to it.
    >> Or invent different names.
    >
    > This is something that I would prefer C and C++ to allow. I think it
    > would improve the structure of some of my code, precisely as you describe.

    It seems that Scheme, with its ugly (define ...) that can be used inside
    block scopes, has the same restriction!

    I tried (lambda () (define x 42) (define x 43)) in a Scheme
    implementation and got an error about the duplicate variable.

    That's completely silly since it breaks the idea that the block scoped
    define can just be desugared to nested lets.

    On a related topic, the CLISP implementation of Common Lisp, whose
    history goes back to the 1980s, availed itself of mixing variable
    declarations and statements even in C90. Its source files are named with
    a .d suffix, and are preprocessed by a "varbrace" tool which spits out
    the brace-enclosed blocks. I seem to recall that variables are prefixed
    with a "var" specifier, which probably makes it easy for the tool to
    recognize declarations.

    It may likely be the case that under CLISP's "varbrace" you can repeat
    variable names.

    ... and searching for varbrace, I see a 2017 thread in the CLISP
    mailing list by someone who posted a patch to rid CLISP of varbrace,
    and just use C99. The patch submitter mentions that he had to rename
    some instances of repeated variables, making this remark:

    Another issue is conflicting definitions of the same variable. Example:

    var type1 foo;
    // some code
    var type2 foo;

    This is solved by renaming one of them, if possible. In two places, I
    manually added braces (like varbrace would've done)

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
  • From Bill Findlay@findlaybill@blueyonder.co.uk to comp.compilers on Wed Jan 11 11:58:52 2023
    From Newsgroup: comp.compilers

    On 10 Jan 2023, David Brown wrote
    (in article <23-01-033@comp.compilers>):

    [Variables at the top probably comes from Algol60 via Pascal. ... -John]

    And via Algol 60 from FORTRAN.

    --
    Bill Findlay
  • From David Brown@david.brown@hesbynett.no to comp.compilers on Wed Jan 11 13:38:09 2023
    From Newsgroup: comp.compilers

    On 11/01/2023 00:13, gah4 wrote:
    On Tuesday, January 10, 2023 at 2:16:32 PM UTC-8, David Brown wrote:

    (snip)
    The point is that you do not declare a variable until you actually have
    something to put in it. You never have this semi-alive object floating
    around where it is accessible, but has no valid or known state. You
    never have an artificial initialisation, such as putting 0 in a variable
    declared at the top of the function, in the mistaken belief that it
    makes code somehow "safer".

    Java requires that the compiler be able to figure out that a variable
    (well, scalar variable) is given a value before it is used. Most of the
    time, that works out fine. Once in a while, I know that it is given
    a value, but the compiler doesn't. In that case, it is initialized
    to (usually) 0, with a comment indicating why.


    The same applies to C and C++ programming, when using static error
    checking. (And during development, you should definitely be using a
    compiler capable of spotting missing initialisations, and you should
    treat such warnings as bugs in your code.) And like Java tools, C and
    C++ compilers are not /quite/ perfect :-)

    So I agree that there are occasional uses for such "artificial"
    initialisation. There are also occasions when declaring a variable
    without initialising makes sense because you will later set its value
    inside a conditional.

    But in general, declaring and initialising at the same time is better
    (IMHO). And artificial zero initialisation is a bad thing, precisely
    because it effectively disables the kind of warning checks you get in
    Java and in C/C++ with an appropriate compiler or linter.
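
    A minimal sketch of the two styles discussed above, using the
    hypothetical functions scale_for() and area(), which are not from the
    thread. It assumes a compiler run with uninitialised-variable warnings
    enabled (for example -Wall in GCC or Clang):

```c
#include <assert.h>

/* The value is set inside a conditional, so the variable is declared
   without an initialiser. If a branch is later added that forgets to
   assign it, a compiler with uninitialised-variable warnings enabled
   has a chance to report that. Writing "int scale = 0;" instead would
   silence any such warning and hide the bug. */
int scale_for(int mode)
{
    int scale;
    if (mode == 0) {
        scale = 1;
    } else if (mode == 1) {
        scale = 10;
    } else {
        scale = 100;
    }
    return scale;
}

/* The general case: declare the variable where its value is known. */
int area(int width, int height)
{
    int result = width * height;
    return result;
}
```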

    (snip)

    [Variables at the top probably comes from Algol60 via Pascal. For assembler,
    depends on the assembler. Lots of them let you have several sections in the
    program and switch between the code and data sections as you go. IBM mainframe
    assemblers had this feature in the 1960s. -John]

    Most of the IBM mainframe assembly code I know, puts the variables
    at the bottom.

    That's new to me - but I have no experience with mainframes. In all the
    assembly I have done (lots of different microcontrollers), I have always
    had the data declared before the code that uses them. I've alternated
    between data and code sections, but not within functions.

    But maybe this is because most of the small microcontrollers I
    programmed were pretty hopeless at dealing with data on a stack, and it
    was normal to put local variables in data sections - you have static
    addressing, rather than through base pointers or frame pointers. It is
    quite different from how you work with "big" processors - even in the
    days when the "big" processors were slower and had less memory than
    modern "small" processors, if you understand what I mean.

    Thanks for the history titbits here. It's always fun to hear about.
  • From gah4@gah4@u.washington.edu to comp.compilers on Wed Jan 11 16:38:32 2023
    From Newsgroup: comp.compilers

    On Wednesday, January 11, 2023 at 3:10:38 PM UTC-8, David Brown wrote:

    (snip, I wrote)
    Java requires that the compiler be able to figure out that a variable
    (well, scalar variable) is given a value before it is used.

    (snip)
    The same applies to C and C++ programming, when using static error
    checking. (And during development, you should definitely be using a
    compiler capable of spotting missing initialisations, and you should
    treat such warnings as bugs in your code.) And like Java tools, C and
    C++ compilers are not /quite/ perfect :-)

    As well as I know it, this isn't optional in Java.

    So I agree that there are occasional uses for such "artificial"
    initialisation. There are also occasions when declaring a variable
    without initialising makes sense because you will later set its value
    inside a conditional.

    (snip)
    [Variables at the top probably comes from Algol60 via Pascal. For assembler,
    depends on the assembler. Lots of them let you have several sections in the
    program and switch between the code and data sections as you go. IBM mainframe
    assemblers had this feature in the 1960s. -John]

    I remember knowing that, but rarely seeing it. In my high school years, I
    would read IBM manuals, and so knew many things that I would never use.

    And since each card of the object program has its address, and which CSECT
    it belongs to, the linker can figure it all out.

    Most of the IBM mainframe assembly code I know, puts the variables
    at the bottom.

    That's new to me - but I have no experience with mainframes. In all the
    assembly I have done (lots of different microcontrollers), I have always
    had the data declared before the code that uses them. I've alternated
    between data and code sections, but not within functions.

    If you allow forward references for code (that is, forward jumps),
    you can also have forward references for data.

    Well, this all works if you have (at least) a two pass assembler.
    Find the address of everything the first time through, and then
    generate code on the second pass.
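
    The two-pass idea can be sketched as follows. This is a toy model, not
    any real assembler: it assumes every source line occupies exactly one
    word, and the struct line, struct symtab, pass1() and pass2() names are
    invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A toy source line: it may define a label, and it may refer to one. */
struct line {
    const char *label;   /* label defined on this line, or NULL */
    const char *target;  /* label referenced by this line, or NULL */
};

#define MAX_SYMS 16

struct symtab {
    const char *name[MAX_SYMS];
    int addr[MAX_SYMS];
    int count;
};

/* Return the address recorded for a label, or -1 if it is undefined. */
int lookup(const struct symtab *st, const char *name)
{
    for (int i = 0; i < st->count; i++)
        if (strcmp(st->name[i], name) == 0)
            return st->addr[i];
    return -1;
}

/* Pass 1: record the address of every label; emit nothing yet. */
void pass1(const struct line *prog, int n, struct symtab *st)
{
    st->count = 0;
    for (int addr = 0; addr < n; addr++) {
        if (prog[addr].label != NULL) {
            assert(st->count < MAX_SYMS);
            st->name[st->count] = prog[addr].label;
            st->addr[st->count] = addr;
            st->count++;
        }
    }
}

/* Pass 2: emit one word per line. All labels are known by now, so
   forward references to code or data resolve like backward ones.
   Lines without a reference emit 0. */
void pass2(const struct line *prog, int n, const struct symtab *st, int *out)
{
    for (int addr = 0; addr < n; addr++)
        out[addr] = prog[addr].target ? lookup(st, prog[addr].target) : 0;
}
```

    With this scheme a data label placed at the bottom of the program is no
    harder to reference than one placed at the top.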

    But maybe this is because most of the small microcontrollers I
    programmed were pretty hopeless at dealing with data on a stack, and it
    was normal to put local variables in data sections - you have static
    addressing, rather than through base pointers or frame pointers. It is
    quite different from how you work with "big" processors - even in the
    days when the "big" processors were slower and had less memory than
    modern "small" processors, if you understand what I mean.

    I am now remembering that the early IBM Fortran compilers put code
    at the bottom of address space, and data at the top, going down.

    One early Fortran feature that got removed later, is the ability to chain.
    One (whole) program can chain to another, which then replaces it
    in memory, but with all COMMON variables still there.

    I suspect that this was replaced by overlay, which is a better
    solution to the problem. It might be that which led to putting
    data after code.

    Another one is that with base-displacement addressing, you only
    need to be able to address the beginning of an array. Putting large
    arrays at the end, then, makes it easier.

    That is also the way Fortran code generators did it for OS/360.
    (Static addressed with code and data together.)

    Early microcomputer assemblers might not have been as good,
    though as well as I know it, Microsoft used the PDP-10 assembler
    from the beginning.

    A favorite way to write microcomputer assembly code, is with
    macros on a mainframe assembler, one for each instruction.
    All the hard work is then done by the assembler itself.
  • From George Neuner@gneuner2@comcast.net to comp.compilers on Thu Jan 12 02:54:37 2023
    From Newsgroup: comp.compilers

    On Wed, 11 Jan 2023 11:02:56 -0000 (UTC), Kaz Kylheku <864-117-4973@kylheku.com> wrote:


    It seems that Scheme, with its ugly (define ...) that can be used inside
    block scopes, [disallows name redefinition]!

    Agree it's ugly: I never use internal defines in my own code.
    Unfortunately some people love them.


    I tried (lambda () (define x 42) (define x 43)) in a Scheme
    implementation and got an error about the duplicate variable.

    That's completely silly since it breaks the idea that the block scoped
    define can just be desugared to nested lets.


    Unfortunately the experience varies by version:


    From R3RS to R5RS, a series of internal defines is treated AS IF they
    all are part of a single 'letrec' with the scope being the whole body.
    'letrec', of course, does not permit multiple bindings to the same
    name.

    In R6RS and R7RS, a series of internal defines is treated as a
    'letrec*' [note trailing *]. letrec* is equivalent to nested letrec
    and so does permit rebinding to the same name ... which will shadow
    any previous bindings.



    From R5RS 5.2.2 Internal Definitions

    ... The variable defined by an internal definition is local to the
    <body>. That is, <variable> is bound rather than assigned, and the region
    of the binding is the entire <body>. For example,

    (let ((x 5))
      (define foo (lambda (y) (bar x y)))
      (define bar (lambda (a b) (+ (* a b) a)))
      (foo (+ x 3))) => 45

    A body containing internal definitions can always be converted into a
    completely equivalent letrec expression. For example, the let
    expression in the above example is equivalent to

    (let ((x 5))
      (letrec ((foo (lambda (y) (bar x y)))
               (bar (lambda (a b) (+ (* a b) a))))
        (foo (+ x 3))))

    Just as for the equivalent letrec expression, it must be possible to
    evaluate each expression of every internal definition in a body
    without assigning or referring to the value of any variable being
    defined.



    From R6RS 11.3 Bodies

    The <body> of a lambda, let, let*, let-values, let*-values, letrec, or
    letrec* expression, or that of a definition with a body consists of
    zero or more definitions followed by one or more expressions.

    <definition> ... <expression1> <expression2> ...

    Each identifier defined by a definition is local to the <body>. That
    is, the identifier is bound, and the region of the binding is the
    entire <body> (see section 5.2).

    Example:

    (let ((x 5))
      (define foo (lambda (y) (bar x y)))
      (define bar (lambda (a b) (+ (* a b) a)))
      (foo (+ x 3))) => 45

    When begin, let-syntax, or letrec-syntax forms occur in a body prior
    to the first expression, they are spliced into the body; see
    section 11.4.7. Some or all of the body, including portions wrapped
    in begin, let-syntax, or letrec-syntax forms, may be specified by a
    macro use (see section 9.2).

    An expanded <body> (see chapter 10) containing variable definitions
    can always be converted into an equivalent letrec* expression. For
    example, the let expression in the above example is equivalent to

    (let ((x 5))
      (letrec* ((foo (lambda (y) (bar x y)))
                (bar (lambda (a b) (+ (* a b) a))))
        (foo (+ x 3))))
  • From Nils M Holm@nmh@t3x.org to comp.compilers on Thu Jan 12 11:15:25 2023
    From Newsgroup: comp.compilers

    Kaz Kylheku <864-117-4973@kylheku.com> wrote:
    I tried (lambda () (define x 42) (define x 43)) in a Scheme
    implementation and got an error about the duplicate variable.

    That's completely silly since it breaks the idea that the block scoped
    define can just be desugared to nested lets.

    If I am not completely mistaken, local DEFINE expands to LETREC
    and not to nested LET, so your example would result in two
    instances of X in the same scope:

    (lambda ()
    (letrec ((x 42)
    (x 43))))

    --
    Nils M Holm < n m h @ t 3 x . o r g > http://t3x.org
    [See the more complete analysis just posted. -John]
  • From Tristan B. Velloza Kildaire@deavmi@redxen.eu to comp.compilers on Fri Jan 13 13:41:37 2023
    From Newsgroup: comp.compilers

    On 2023/01/02 22:52, Spiros Bousbouras wrote:
    I plan to, by
    the end of 2023 hopefully, have a full release out. The code emit is
    already working well and so is the dependency tree algorithm.

    Code emitter for what targets ?

    The code emitter currently emits C, so it supports whatever your C
    compiler can target.

  • From Tristan B. Velloza Kildaire@deavmi@redxen.eu to comp.compilers on Fri Jan 13 14:17:56 2023
    From Newsgroup: comp.compilers

    On 2023/01/05 02:12, Hans-Peter Diettrich wrote:
    On 1/2/23 11:28 AM, Tristan B. Velloza Kildaire wrote:

    I am currently working on my own compiler for something like C

    Define "like C".

    DoDi
    [Grouped with {} ? Comments with /* */ ? Designed by a guy who
    worked for the phone company? -John]

    Think of C, but with object orientation similar to C++ added, though with
    only single inheritance, plus interface support (as per C++ as well). Really
    Java's OOP model, but attached to C.

    Kind of not too keen on generics as I think they blur source code
    clarity somewhat. Still open to suggestions.

    But stripped down C++ in the sense of avoiding many pitfalls such as
    weird syntax for certain declarations.

    There are other things I aim to fix, such as fixing the evaluation order
    of function arguments (which is unspecified in C) - this is done in DGen
    (the C code emitter level). That's what I am going for.
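
    One way an emitter that targets C could pin down argument order is to
    evaluate each argument into a temporary before the call. The sketch
    below is only an illustration of that technique under that assumption;
    it is not taken from DGen, and next_value(), sub() and
    call_left_to_right() are hypothetical names.

```c
#include <assert.h>

/* Hypothetical helper with a visible side effect, so that the order
   of evaluation can be observed. */
int counter = 0;

int next_value(void)
{
    return ++counter;
}

int sub(int a, int b)
{
    return a - b;
}

/* Written directly as sub(next_value(), next_value()), the order of
   the two calls is unspecified in C, so the result could be -1 or 1.
   Evaluating each argument into a temporary first fixes the order,
   because each declaration is a complete statement. */
int call_left_to_right(void)
{
    int arg0 = next_value();  /* evaluated first */
    int arg1 = next_value();  /* evaluated second */
    assert(arg0 < arg1);
    return sub(arg0, arg1);
}
```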


    I have a memory model for OOP planned out, the OOP features will come in at
    a later stage, parsing and most of dependency tree initialization is
    there but needs some tweaking. Currently working on getting modules
    supported correctly.

    Flow control is there from lex to emit, as are functions, expressions
    and so on. I will keep everyone up to date and let you know when there
    is an alpha out. I just want to polish the remaining needed features and
    the source code itself (as I intend it to be readable for others who
    want to contribute).

    Development has picked up quite a lot now that I have finished
    university. I am starting work soon but intend to use my free time to
    work on it.

    When it is all polished I will open source it, clean up docs and also
    work on the "book" section which aims to be a language manual and
    implementation manual (explaining the inner workings).
  • From Luke A. Guest@laguest@archeia.com to comp.compilers on Fri Jan 13 18:25:37 2023
    From Newsgroup: comp.compilers

    On 09/01/2023 17:41, Kaz Kylheku wrote:
    On 2023-01-06, David Brown <david.brown@hesbynett.no> wrote:
    don't want to go through them all, but I agree with you that the style
    of "all your declarations at the start of the function" is long
    outdated, and often - but not universally - considered a bad idea.)

    Declarations have never been required to be at the top of a function in
    C, because they can be in any compound statement block. I think
    that goes all the way back to the B language. [Nope, see the next message. -John]

    When I learnt C, you had to define your variables at the top of the
    block {} whether that's a function or a block within the function somewhere.

    The "Variables at the top" meme may be something coming from Pascal.

    Nope. Algol. C is an Algol derived language.

    IIRC, in Pascal, compound statements aren't full blocks; they cannot
    have VAR declarations.

    When programmers abandoned Pascal in the 1980s, they carried over this
    habit into C.

    Nope, this was defined in the C spec and the K&R book. Apparently this
    has been relaxed recently-ish and now variables can be defined anywhere.
  • From gah4@gah4@u.washington.edu to comp.compilers on Fri Jan 13 10:32:16 2023
    From Newsgroup: comp.compilers

    On Friday, January 13, 2023 at 10:00:11 AM UTC-8, Tristan B. Velloza Kildaire wrote:

    (snip)

    Think of C, but with object orientation similar to C++ added, though with
    only single inheritance, plus interface support (as per C++ as well). Really
    Java's OOP model, but attached to C.

    I always thought that Java was, intentionally, more like C than C++ is like C.

    That was even before they added System.out.format(), which should make
    C programmers even happier.

    Some time ago, I was trying to figure out if you could make a C compiler
    that generated JVM code. I would run much closer to the C standard
    than much C code does, especially regarding casting of pointers.
    [So what did you conclude? I'd think C type casts would be hard to
    turn into Java unless you made all of storage an opaque block. -John]
  • From gah4@gah4@u.washington.edu to comp.compilers on Fri Jan 13 12:39:41 2023
    From Newsgroup: comp.compilers

    On Friday, January 13, 2023 at 11:10:37 AM UTC-8, gah4 wrote:

    (snip)

    Some time ago, I was trying to figure out if you could make a C compiler
    that generated JVM code. I would run much closer to the C standard
    than much C code does, especially regarding casting of pointers.

    [So what did you conclude? I'd think C type casts would be hard to
    turn into Java unless you made all of storage an opaque block. -John]

    Someone else might have thought about the "opaque block" method.
    But that wouldn't work if you wanted to call between Java and C.

    As well as I know it, C only requires assignment to work for
    pointers cast to (unsigned char *). And once they are cast,
    usually (though I suppose not always), it is done with memcpy(),
    or compared with memcmp().

    So, all the complication of figuring out what is actually being
    done, can be done inside one of those.

    C pointers, then, are an object with a reference to the actual
    array, and current offset within the array, and bounds for
    the array. Pointer arithmetic only changes the offset.

    Scalar variables that can be pointed to, compile as arrays
    dimensioned [1].
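
    The pointer representation described above can be sketched in C itself.
    This is only a model of the idea, restricted to int arrays for brevity;
    struct fatptr and the fp_* functions are hypothetical names, not part
    of any existing C-to-JVM compiler.

```c
#include <assert.h>
#include <stddef.h>

/* A "fat pointer": a reference to the underlying array, the current
   offset within it, and the bound of the array. */
struct fatptr {
    int *base;      /* the underlying array */
    size_t offset;  /* current element index */
    size_t length;  /* number of elements in the array */
};

/* Pointer arithmetic changes only the offset. */
struct fatptr fp_add(struct fatptr p, size_t n)
{
    p.offset += n;
    return p;
}

/* Dereference with a bounds check.
   Returns 1 and stores the value on success, 0 if out of bounds. */
int fp_load(struct fatptr p, int *out)
{
    assert(p.base != NULL);
    if (p.offset >= p.length)
        return 0;
    *out = p.base[p.offset];
    return 1;
}

/* A scalar that can be pointed to is modelled as an array of length 1. */
struct fatptr fp_of_scalar(int *cell)
{
    struct fatptr p = { cell, 0, 1 };
    return p;
}
```

    On a JVM the base would be a reference to a Java array, and the bounds
    check would come from the array itself rather than from a stored length.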

    I didn't get as far as figuring out varargs functions, but someone
    must have done that, as System.out.format() works.
    You can call it with the usual different argument types,
    and it figures out everything.
  • From George Neuner@gneuner2@comcast.net to comp.compilers on Fri Jan 13 17:20:35 2023
    From Newsgroup: comp.compilers

    On Fri, 13 Jan 2023 18:25:37 +0000, "Luke A. Guest"
    <laguest@archeia.com> wrote:

    C is an Algol derived language.

    Well ... Algol influenced in any case, through B.
    [Rather indirectly. C is derived from B which was a
    stripped down version of BCPL which was a bootstrap
    subset of CPL which was designed in the early 1960s
    with influence from Algol. See Wikipedia. -John]
  • From Kaz Kylheku@864-117-4973@kylheku.com to comp.compilers on Sat Jan 14 19:07:50 2023
    From Newsgroup: comp.compilers

    On 2023-01-13, Luke A. Guest <laguest@archeia.com> wrote:
    On 09/01/2023 17:41, Kaz Kylheku wrote:
    On 2023-01-06, David Brown <david.brown@hesbynett.no> wrote:
    don't want to go through them all, but I agree with you that the style
    of "all your declarations at the start of the function" is long
    outdated, and often - but not universally - considered a bad idea.)

    Declarations have never been required to be at the top of a function in
    C, because they can be in any compound statement block. I think
    that goes all the way back to the B language. [Nope, see the next message. -John]

    When I learnt C, you had to define your variables at the top of the
    block {} whether that's a function or a block within the function somewhere.

    Well yes; that is the situation in ISO C 90. ISO C 99 introduced mixed
    declarations and statements. Or, perhaps we should say, the C++ dialect
    introduced this (and standardized it first, in 1998).

    The "Variables at the top" meme may be something coming from Pascal.

    Nope. Algol. C is an Algol derived language.

    IIRC, in Pascal, compound statements aren't full blocks; they cannot
    have VAR declarations.

    When programmers abandoned Pascal in the 1980s, they carried over this
    habit into C.

    Nope, this was defined in the C spec and the K&R book. Apparently this
    has been relaxed recently-ish and now variables can be defined anywhere.

    The discussion is about whether variables must be declared at the top of
    the entire function, even if it has nested compound statements where
    some of them could be declared.

    The top-of-function restriction exists in Pascal; compound statements
    do not have a VAR section.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    [Algol60 allowed declarations at the start of every block, so I think
    it was one of the things Pascal left out to make it easier to compile.
    It does make one-pass compiling with tiny memory easier. These days,
    that's irrelevant. -John]
  • From marb...@yahoo.co.uk@marblypup@yahoo.co.uk to comp.compilers on Sun Jan 15 04:21:52 2023
    From Newsgroup: comp.compilers

    On Wednesday, 11 January 2023 at 23:07:33 UTC, Thomas Koenig wrote:
    Algol 58
    aka IAL had declarations everywhere, while Algol 60 allowed them
    only at the beginning of blocks.

    But Algol 68 allows them anywhere:

    INT x:=42;
    print(("x=",x,newline));
    INT y:=76;
    print(("y=",y,newline));
    x:=y:=101;
    print(("x=",x,newline,"y=",y,newline))

    (There's something a bit weird about it in A68, though I can't
    remember what.)
    (I tend to think of A68 as "A60 made perfect... then some dubious
    features are added". I gather it took 6 years to write the first
    complete A68 compiler! Well, strictly speaking, they'd revised A68 by
    then, so it was an A68R compiler.)
  • From marb...@yahoo.co.uk@marblypup@yahoo.co.uk to comp.compilers on Sun Jan 15 04:26:48 2023
    From Newsgroup: comp.compilers

    On Wednesday, 11 January 2023 at 23:10:38 UTC, David Brown wrote:
    The same applies to C and C++ programming, when using static error
    checking. (And during development, you should definitely be using a
    compiler capable of spotting missing initialisations, and you should
    treat such warnings as bugs in your code.) And like Java tools, C and
    C++ compilers are not /quite/ perfect :-)

    So I agree that there are occasional uses for such "artificial"
    initialisation. There are also occasions when declaring a variable
    without initialising makes sense because you will later set its value
    inside a conditional.

    Indeed. I've occasionally had compilers complain about uninitialised
    variables though I could see that my (rather perverse?) code did
    always initialise them before they were used (but possibly not if they
    weren't). In such cases, I've added an initialisation to the
    declaration just to shut the compiler up. (Warnings of uninitialised
    variables usually do indicate bugs.)

  • From Andy Walker@anw@cuboid.co.uk to comp.compilers on Sun Jan 15 22:01:00 2023
    From Newsgroup: comp.compilers

    On 15/01/2023 12:21, marb...@yahoo.co.uk wrote:
    [...] I gather it took 6 years to write the first
    complete A68 compiler! Well, strictly speaking, they'd revised A68 by
    then, so it was an A68R compiler.

    A68-R was announced in July 1970, so much less than 6 years
    from the A68 Report [December 1968]. It wasn't entirely a complete
    compiler, having a few changes primarily to permit single-pass
    compilation [but also, eg, "CODE ... EDOC" brackets for embedded
    machine code]. Nothing to do with the Revised Report, which was
    several years later, and incorporated the experience of A68-R. It
    also took much less than 6 years from the publication of the RR to
    the first compilers.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Byrd
  • From gah4@gah4@u.washington.edu to comp.compilers on Sun Jan 29 21:39:26 2023
    From Newsgroup: comp.compilers

    On Sunday, January 29, 2023 at 8:57:52 AM UTC-8, dave_th...@comcast.net wrote:
    On Fri, 13 Jan 2023 12:39:41 -0800 (PST), gah4 <ga...@u.washington.edu>

    (I wrote)
    Some time ago, I was trying to figure out if you could make a C compiler
    that generated JVM code. I would run much closer to the C standard
    than much C code does, especially regarding casting of pointers.

    [So what did you conclude? I'd think C type casts would be hard to
    turn into Java unless you made all of storage an opaque block. -John]

    Someone else might have thought about the "opaque block" method.
    But that wouldn't work if you wanted to call between Java and C.

    As well as I know it, C only requires assignment to work for
    pointers cast to (unsigned char *). And once they are cast,
    usually (though I suppose not always), it is done with memcpy(),
    or compared with memcmp().

    Only unsigned char is 100% guaranteed, but on all known systems today
    signed char has no trap rep and also works and so does plain char.

    But if the standard says (unsigned char *), and it failed with other
    types, would it still be C?

    In any case, I would put all the complications into memcpy() and memcmp().

    Assignments cast to (unsigned char *) could call memcpy().
    Otherwise, they would be assumed to work.

    (snip)

    I didn't get as far as figuring out varargs functions, but someone
    must have done that, as System.out.format() works.
    You can call it with the usual different argument types,
    and it figures out everything.

    Java's System.out.format -- and Java's varargs in general -- works
    differently than C (at least C as practiced; the standard imposes
    enough restrictions you probably _could_ implement it differently).

    When Java calls a varargs method, the _caller_ silently creates an
    array and fills it with the argument values, all converted to the one
    type specified in the definition (or compiled equivalent), and that
    _array_ is actually passed along with the fixed args, in this case the
    format string and possibly locale. For this case the one type is
    java.lang.Object, which is the top-type for all class _and_ array(1)
    instances in Java so they pass unchanged; any primitive value (int,
    float, etc) is silently converted to an instance of a builtin class
    (java.lang.Integer, java.lang.Float, etc) by 'autoboxing'. As a result
    the format method(2) just matches format specifiers to elements of
    that array (remember each Java array instance knows its own length so
    subscripting out of bounds traps).

    But also, Java didn't always do that automatically.

    Or more simply, Java varargs is sugar for a homogenous array.

    I suppose, but it is a lot of sugar!
    Having to do the array creating, and all the conversions to fill
    the array is a lot of work! And a lot of cases to get wrong.
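
    For comparison, the same calling convention can be modelled by hand in
    C: the caller boxes each argument into a tagged value, collects the
    values in one homogeneous array, and passes the array with its length.
    This is a rough analogy, not how any Java implementation is written;
    struct boxed and format_boxed() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* A "boxed" value: a tag saying which type is stored, plus the value. */
enum tag { TAG_INT, TAG_DOUBLE };

struct boxed {
    enum tag tag;
    union { int i; double d; } u;
};

/* The callee receives one array plus its length, instead of a variable
   number of differently typed arguments. It inspects each tag to decide
   how to print the element, followed by a space.
   Returns the number of characters written, or -1 if buf is too small. */
int format_boxed(char *buf, size_t size, const struct boxed *args, size_t n)
{
    size_t used = 0;

    if (size == 0)
        return -1;
    buf[0] = '\0';
    for (size_t k = 0; k < n; k++) {
        int w;
        if (args[k].tag == TAG_INT)
            w = snprintf(buf + used, size - used, "%d ", args[k].u.i);
        else
            w = snprintf(buf + used, size - used, "%.1f ", args[k].u.d);
        if (w < 0 || (size_t)w >= size - used)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

    Writing the boxing at every call site by hand shows how much work the
    Java compiler takes on when it generates the array automatically.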

    Some years ago, I was doing Practice it!, which asks you,
    in many cases, to write a Java program. The system then compiles
    and runs your program, and verifies the output. It mostly makes no
    requirement on how you write it. At some point, I started
    using System.out.format() for my output statements.

    If you want to try it: https://practiceit.cs.washington.edu/

    Anyway, yes, that is what I thought Java did with them.
    Though some of my programs use arrays dimensioned [1]
    instead of the usual wrapper classes.