I am currently working on my own compiler for something like C but with minimal object orientation support and no features like
templating/generics etc etc.
Trying to get a feeler out there for anyone who would be interested in
using such a language, obviously the project is something I work on in
my spare time but I have written everything from scratch.
I plan to, by
the end of 2023 hopefully, have a full release out. The code emit is
already working well and so is the dependency tree algorithm.
On 2 Jan 2023, at 20:52, Spiros Bousbouras <spibou@gmail.com> wrote:
On Mon, 2 Jan 2023 12:28:12 +0200
"Tristan B. Velloza Kildaire" <deavmi@redxen.eu> wrote:
I am currently working on my own compiler for something like C but with
minimal object orientation support and no features like
templating/generics etc etc.
Trying to get a feeler out there for anyone who would be interested in
using such a language, obviously the project is something I work on in
my spare time but I have written everything from scratch.
Knowing what you are trying to achieve i.e. why you are creating a new programming language would be useful. On which operating systems is it
going to work ? What will be the license ?
I plan to, by
the end of 2023 hopefully, have a full release out. The code emit is
already working well and so is the dependency tree algorithm.
Code emitter for what targets ?
I'm not sure there would be that much demand for a cut down C.
a start for a new C++ like language.
This is true that it's around, but I think it has copyright / license limitations that would prevent building something new on top of it.
[The copy at the computer history museum says "The source code in this section is posted with the permission of the copyright owner for
historical research purposes only." It's from 1997 so I would think
it's a long way from modern C++. -John]
I am currently working on my own compiler for something like C
I'm not sure there would be that much demand for a cut down C.
I'm up for reading the source of any relatively simple compiler for, and written in, anything C-like. I've tried making sense of the GNU C compiler a few times. My brain may recover one day!
On Tuesday, 3 January 2023 at 17:45:17 UTC, Steve Limb wrote:
I'm not sure there would be that much demand for a cut down C.
I recently read (well, skimmed) http://www.mjbauer.biz/C-less%20Reference%20Manual.pdf
"A concise subset of the C programming language".
Though I'm a bit baffled by some of Bauer's choices. Why is
`char *foo="foo", *bar="bar"; puts(foo); puts(bar);`
allowed but not
`char *foo="foo"; puts(foo); char *bar="bar"; puts(bar);`
? Admittedly, the latter is only allowed in relatively recent C, but from my (very limited) experience writing compilers, the latter is no harder to compile.
I idly thought about adding stuff to C-less and calling it C-more-or-less, Cmol, for short.
I'm up for reading the source of any relatively simple compiler for, and written in, anything C-like. I've tried making sense of the GNU C compiler a few times. My brain may recover one day!
[If you're doing a one-pass compiler, it's easier if all the declarations are at the
beginning so you can generate the code to set up the stack frame and do initializations.
I agree that on modern computers it's not a big deal, but remember that early C compilers
ran in 24K bytes and I don't mean megabytes. -John]
Presumably such a compiler would have to create 2 stack frames for
`char *foo="foo"; puts(foo); { char *bar="bar"; puts(bar); }`
[In a mutant version of C with nested scopes, I suppose so, but when C compilers
ran in 24K bytes, it didn't. -John]
In other words, it can combine all the variables declared in nested
scope and act as though they were all defined at the start of the
function.
don't want to go through them all, but I agree with you that the style
of "all your declarations at the start of the function" is long
outdated, and often - but not universally - considered a bad idea.)
[In a mutant version of C with nested scopes, I suppose so, but when C compilers
ran in 24K bytes, it didn't. -John]
I don't have my copy of K&R handy, or any pre-K&R Unix C manuals, but I
expect someone will correct me if I'm wrong :-) As far as I know, the C described in "The C Programming Language" in 1978, when 24 KB was still
a big deal, supported declarations at the start of any compound
statement block. That is, nested scopes. It's possible that pre-K&R C compilers were more limited.
[I actually used that 24K C compiler in about 1975 and I am reasonably sure it did not let you put declarations other than in the outer block. There's
a 1978 edition of K&R at archive.org and by then it did let you put declarations in any block. It's a little harder than what you say because declarations in non-overlapping blocks should overlay each other, e.g.:
    foo() {
        int a;
        ...
        {
            int b[100];
            somefunc(b);
        }
        {
            float c[100];
            otherfunc(c);
        }
    }
you want b and c to use the same storage. It's not hard, but it's a little more than promoting and renaming. -John]
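The overlay John describes can be sketched as a small frame-size calculation. This is only an illustration under stated assumptions: `struct block` and `frame_size` are hypothetical names, sizes are plain byte counts, and alignment is ignored. Sibling blocks all start at the same offset, so the frame needs only the largest of them rather than their sum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one compound statement: the bytes of
   locals declared directly in it, plus its nested child blocks. */
struct block {
    size_t locals_size;            /* bytes declared directly in this block */
    const struct block *children;  /* nested blocks, in source order */
    size_t nchildren;
};

/* Return the end offset of the frame needed for block b when its own
   locals start at offset base. Every child starts right after b's own
   locals, so siblings overlap; the result is the deepest end offset. */
size_t frame_size(const struct block *b, size_t base)
{
    assert(b != NULL);
    size_t start = base + b->locals_size;  /* where each child begins */
    size_t max_end = start;
    for (size_t i = 0; i < b->nchildren; i++) {
        size_t end = frame_size(&b->children[i], start);
        if (end > max_end)
            max_end = end;
    }
    return max_end;
}
```

With a 4-byte outer local and two 400-byte sibling arrays, this gives 404 bytes, where simply promoting and renaming every declaration to the top of the function would need 804.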
On 2023-01-06, David Brown <david.brown@hesbynett.no> wrote:
don't want to go through them all, but I agree with you that the style
of "all your declarations at the start of the function" is long
outdated, and often - but not universally - considered a bad idea.)
Declarations have never been required to be at the top of a function in
C, because they can be in any compound statement block. I think
that goes all the way back to the B language. [Nope, see the next message. -John]
The "Variables at the top" meme may be something coming from Pascal.
IIRC, in Pascal, compound statements aren't full blocks; they cannot
have VAR declarations.
When programmers abandoned Pascal in the 1980s, they carried over this
habit into C.
I hate mixed declarations and code because it's almost as bad as variables-at-the-top. The scope of a declaration that is just planted
into the middle of a compound statement block extends all the way to the
end of the block. There should be a smaller enclosing block which
exactly delimits the scope of that variable. If some variable is used
over seven lines of a 300 line function, those seven lines should
ideally be enclosed in curly braces, so the variable is not known
outside of those lines. Just planting an unwrapped declaration of the variable at the function scope level (outermost block) solves only half
the problem. The scope of the variable starts close to where the
variable is used, which is good; but it still goes to the end of the function, way past its actual semantic scope that ends at the last use.
A block like this can be repeated with copy and paste:
{
int yes = 1;
setsockopt(fd, SO_WHATEVER, &yes);
}
This cannot: you will get redefinition errors:
int yes = 1;
setsockopt(fd, SO_WHATEVER, &yes);
you have to think about ensuring that "int yes" occurs in one place
that is before the first use, and the other places assign to it.
Or invent different names.
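A compilable version of the same point, as a minimal sketch: `set_option` and `configure` are hypothetical stand-ins for the `setsockopt` calls above, not real socket code. Each braced block declares its own `yes`, so the second block is a straight paste of the first with only the target changed.

```c
/* Hypothetical stand-in for a call that takes a pointer to an option value. */
static void set_option(int *slot, const int *value)
{
    *slot = *value;
}

void configure(int *opt_a, int *opt_b)
{
    {
        int yes = 1;              /* visible only inside these braces */
        set_option(opt_a, &yes);
    }
    {
        int yes = 1;              /* same name, new object, no redefinition */
        set_option(opt_b, &yes);
    }
}
```

Removing the two inner pairs of braces makes the second `int yes = 1;` a redefinition in the same scope, which is the error described above.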
The point is that you do not declare a variable until you actually have something to put in it. You never have this semi-alive object floating
around where it is accessible, but has no valid or known state. You
never have an artificial initialisation, such as putting 0 in a variable declared at the top of the function, in the mistaken belief that it
makes code somehow "safer".
[Variables at the top probably comes from Algol60 via Pascal. For assembler, depends on the assembler. Lots of them let you have several sections in the program and switch between the code and data sections as you go. IBM mainframe
assemblers had this feature in the 1960s. -John]
The "Variables at the top" meme may be something coming from Pascal.
IIRC, in Pascal, compound statements aren't full blocks; they cannot
have VAR declarations.
When programmers abandoned Pascal in the 1980s, they carried over this
habit into C.
On 09/01/2023 18:41, Kaz Kylheku wrote:
On 2023-01-06, David Brown <david.brown@hesbynett.no> wrote:
A block like this can be repeated with copy and paste:
{
int yes = 1;
setsockopt(fd, SO_WHATEVER, &yes);
}
This cannot: you will get redefinition errors:
int yes = 1;
setsockopt(fd, SO_WHATEVER, &yes);
you have to think about ensuring that "int yes" occurs in one place
that is before the first use, and the other places assign to it.
Or invent different names.
This is something that I would prefer C and C++ to allow. I think it
would improve the structure of some of my code, precisely as you describe.
[Variables at the top probably comes from Algol60 via Pascal. ... -John]
On Tuesday, January 10, 2023 at 2:16:32 PM UTC-8, David Brown wrote:
(snip)
The point is that you do not declare a variable until you actually have
something to put in it. You never have this semi-alive object floating
around where it is accessible, but has no valid or known state. You
never have an artificial initialisation, such as putting 0 in a variable
declared at the top of the function, in the mistaken belief that it
makes code somehow "safer".
Java requires that the compiler be able to figure out that a variable
(well, scalar variable) is given a value before it is used. Most of the time, that works out fine. Once in a while, I know that it is given
a value, but the compiler doesn't. In that case, it is initialized
to (usually) 0, and a comment indicating why.
(snip)
[Variables at the top probably comes from Algol60 via Pascal. For assembler,
depends on the assembler. Lots of them let you have several sections in the
program and switch between the code and data sections as you go. IBM mainframe
assemblers had this feature in the 1960s. -John]
Most of the IBM mainframe assembly code I know, puts the variables
at the bottom.
Java requires that the compiler be able to figure out that a variable (well, scalar variable) is given a value before it is used.
The same applies to C and C++ programming, when using static error
checking. (And during development, you should definitely be using a
compiler capable of spotting missing initialisations, and you should
treat such warnings as bugs in your code.) And like Java tools, C and
C++ compilers are not /quite/ perfect :-)
So I agree that there are occasional uses for such "artificial" initialisation. There are also occasions when declaring a variable
without initialising makes sense because you will later set its value
inside a conditional.
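A small sketch of that last case, with `clamp_to_byte` as a hypothetical example function. The variable is declared without an initialiser, but every path through the conditional assigns it before the read, so no artificial initial value is needed and a compiler's flow analysis has nothing to complain about.

```c
/* Hypothetical example: clamp v into the range 0..255. */
int clamp_to_byte(int v)
{
    int result;          /* deliberately no initialiser here */

    if (v < 0)
        result = 0;
    else if (v > 255)
        result = 255;
    else
        result = v;

    /* Every branch above assigned result, so this read is well defined. */
    return result;
}
```

Dropping the final `else` branch would leave one path where `result` is read without being set, which is the kind of mistake the static checks mentioned above are meant to report.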
[Variables at the top probably comes from Algol60 via Pascal. For assembler,
depends on the assembler. Lots of them let you have several sections in the
program and switch between the code and data sections as you go. IBM mainframe
assemblers had this feature in the 1960s. -John]
Most of the IBM mainframe assembly code I know, puts the variables
at the bottom.
That's new to me - but I have no experience with mainframes. In all the assembly I have done (lots of different microcontrollers), I have always
had the data declared before the code that uses them. I've alternated
between data and code sections, but not within functions.
But maybe this is because most of the small microcontrollers I
programmed were pretty hopeless at dealing with data on a stack, and it
was normal to put local variables in data sections - you have static addressing, rather than through base pointers or frame pointers. It is
quite different from how you work with "big" processors - even in the
days when the "big" processors were slower and had less memory than
modern "small" processors, if you understand what I mean.
It seems that Scheme, with its ugly (define ...) that can be used inside block scopes, [disallows name redefinition]!
I tried (lambda () (define x 42) (define x 43)) in a Scheme
implementation and got an error about the duplicate variable.
That's completely silly since it breaks the idea that the block scoped
define can just be desugared to nested lets.
I plan to, by
the end of 2023 hopefully, have a full release out. The code emit is
already working well and so is the dependency tree algorithm.
Code emitter for what targets ?
On 1/2/23 11:28 AM, Tristan B. Velloza Kildaire wrote:
I am currently working on my own compiler for something like C
Define "like C".
DoDi
[Grouped with {}? Comments with /* */? Designed by a guy who
worked for the phone company? -John]
On 2023-01-06, David Brown <david.brown@hesbynett.no> wrote:
don't want to go through them all, but I agree with you that the style
of "all your declarations at the start of the function" is long
outdated, and often - but not universally - considered a bad idea.)
Declarations have never been required to be at the top of a function in
C, because they can be in any compound statement block. I think
that goes all the way back to the B language. [Nope, see the next message. -John]
The "Variables at the top" meme may be something coming from Pascal.
IIRC, in Pascal, compound statements aren't full blocks; they cannot
have VAR declarations.
When programmers abandoned Pascal in the 1980s, they carried over this
habit into C.
Think of C, but with object orientation similar to C++ added, though with
single inheritance and interface support (as per C++ as well). Really Java's
OOP model, but attached to C.
Some time ago, I was trying to figure out if you could make a C compiler
that generated JVM code. It would run much closer to the C standard
than much C code does, especially regarding casting of pointers.
[So what did you conclude? I'd think C type casts would be hard to
turn into Java unless you made all of storage an opaque block. -John]
C is an Algol derived language.
On 09/01/2023 17:41, Kaz Kylheku wrote:
On 2023-01-06, David Brown <david.brown@hesbynett.no> wrote:
don't want to go through them all, but I agree with you that the style
of "all your declarations at the start of the function" is long
outdated, and often - but not universally - considered a bad idea.)
Declarations have never been required to be at the top of a function in
C, because they can be in any compound statement block. I think
that goes all the way back to the B language. [Nope, see the next message. -John]
When I learnt C, you had to define your variables at the top of the
block {} whether that's a function or a block within the function somewhere.
The "Variables at the top" meme may be something coming from Pascal.
Nope. Algol. C is an Algol derived language.
IIRC, in Pascal, compound statements aren't full blocks; they cannot
have VAR declarations.
When programmers abandoned Pascal in the 1980s, they carried over this
habit into C.
Nope, this was defined in the C spec and the K&R book. Apparently this
has been relaxed recently-ish and now variables can be defined anywhere.
Algol 58
aka IAL had declarations everywhere, while Algol 60 allowed them
only at the beginning of blocks.
The same applies to C and C++ programming, when using static error
checking. (And during development, you should definitely be using a
compiler capable of spotting missing initialisations, and you should
treat such warnings as bugs in your code.) And like Java tools, C and
C++ compilers are not /quite/ perfect :-)
So I agree that there are occasional uses for such "artificial" initialisation. There are also occasions when declaring a variable
without initialising makes sense because you will later set its value
inside a conditional.
[...] I gather it took 6 years to write the first
complete A68 compiler! Well, strictly speaking, they'd revised A68 by
then, so it was an A68R compiler.
On Fri, 13 Jan 2023 12:39:41 -0800 (PST), gah4 <ga...@u.washington.edu> wrote:
Some time ago, I was trying to figure out if you could make a C compiler that generated JVM code. It would run much closer to the C standard
than much C code does, especially regarding casting of pointers.
[So what did you conclude? I'd think C type casts would be hard to
turn into Java unless you made all of storage an opaque block. -John]
Someone else might have thought about the "opaque block" method.
But that wouldn't work if you wanted to call between Java and C.
As well as I know it, C only requires assignment to work for
pointers cast to (unsigned char *). And once they are cast,
usually (though I suppose not always), it is done with memcpy(),
or compared with memcmp().
Only unsigned char is 100% guaranteed, but on all known systems today
signed char has no trap rep and also works and so does plain char.
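As a minimal sketch of the portable subset being discussed: reading an object's bytes through `unsigned char *`, then moving and comparing them with `memcpy()` and `memcmp()`. The helper names `bytes_of` and `same_bytes` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Copy the n bytes of an object out through an unsigned char view,
   the one pointer type guaranteed to read any object representation. */
void bytes_of(const void *obj, size_t n, unsigned char *out)
{
    const unsigned char *p = (const unsigned char *)obj;
    for (size_t i = 0; i < n; i++)
        out[i] = p[i];
}

/* Compare two object representations byte for byte. */
int same_bytes(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}
```

A caller can copy a `float` into an `unsigned char` buffer with `bytes_of`, `memcpy()` the buffer into a second `float`, and get the same value back. A C-to-JVM compiler limited to strictly conforming code would need to support roughly this much, rather than arbitrary pointer casts.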
I didn't get as far as figuring out varargs functions, but someone
must have done that, as System.out.format() works.
You can call it with the usual different argument types,
and it figures out everything.
Java's System.out.format -- and Java's varargs in general -- works differently than C (at least C as practiced; the standard imposes
enough restrictions you probably _could_ implement it differently).
When Java calls a varargs method, the _caller_ silently creates an
array and fills it with the argument values, all converted to the one
type specified in the definition (or compiled equivalent), and that
_array_ is actually passed along with the fixed args, in this case the
format string and possibly locale. For this case the one type is java.lang.Object, which is the top-type for all class _and_ array(1) instances in Java so they pass unchanged; any primitive value (int,
float, etc) is silently converted to an instance of a builtin class (java.lang.Integer, java.lang.Float, etc) by 'autoboxing'. As a result
the format method(2) just matches format specifiers to elements of
that array (remember each Java array instance knows its own length so subscripting out of bounds traps).
Or more simply, Java varargs is sugar for a homogeneous array.
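To make the contrast with C's `<stdarg.h>` concrete, here is a rough C model of the Java scheme, not Java itself. The names `struct boxed`, `enum tag` and `format_boxed` are hypothetical. `struct boxed` stands in for java.lang.Object plus autoboxing: the caller builds one array of tagged values, and the callee gets that array and its length, so it never has to guess argument types or count from a format string.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for a boxed Java value: one type that can hold any argument. */
enum tag { BOX_INT, BOX_DOUBLE, BOX_STRING };

struct boxed {
    enum tag tag;
    union { int i; double d; const char *s; } as;
};

/* Write each argument followed by a space into out (capacity cap).
   Returns the number of characters written, or -1 if they do not fit. */
int format_boxed(char *out, size_t cap, const struct boxed *args, size_t nargs)
{
    if (cap == 0)
        return -1;
    out[0] = '\0';
    size_t used = 0;
    for (size_t k = 0; k < nargs; k++) {   /* the length is known, as in Java */
        int n;
        switch (args[k].tag) {
        case BOX_INT:    n = snprintf(out + used, cap - used, "%d ", args[k].as.i); break;
        case BOX_DOUBLE: n = snprintf(out + used, cap - used, "%g ", args[k].as.d); break;
        case BOX_STRING: n = snprintf(out + used, cap - used, "%s ", args[k].as.s); break;
        default:         return -1;
        }
        if (n < 0 || (size_t)n >= cap - used)
            return -1;                     /* output would not fit */
        used += (size_t)n;
    }
    return (int)used;
}
```

In this model the "boxing" is the caller filling in the array by hand. In Java the compiler inserts that step at every call site.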