Bart <bc@freeuk.com> writes:
It's another point of confusion. In my language I don't treat function
declarations like variable declarations. A function is not a
variable. There is no data storage associated with it.
In C, declarations can declare objects, functions, types, etc. I fail
to see how your language is relevant.
In C it is unfortunate, as it makes it hard to trivially distinguish a
function declaration (or the start of a function definition) from a
variable declaration.
It's not as hard as you insist on pretending it is. A function
declaration includes a pair of parentheses, either empty or
containing a list of parameters or parameter types.
Function declarations outside header files are valid, but tend to be
rare in well-written C code.
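For instance, a minimal sketch (the identifiers here are invented purely
for illustration) of how the parentheses mark the difference:
int counter;              /* object declaration: no parameter list follows the name */
int get_count(void);      /* function declaration: parentheses follow the name      */
int (*handler)(int);      /* object declaration: handler is a pointer to a function */
int get_count(void)       /* start of a function definition                          */
{
    return counter;
}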
Bart <bc@freeuk.com> writes:
On 29/11/2024 20:35, Keith Thompson wrote:
(So it would have a different type from E declared in the same
declaration:
int D[3][4][5], E;
? In that case tell that to David Brown!)
Yes, of course D and E have different types. I'm certain he's
aware of that.
What "range of types" do you think D can have?
Would you write "const int F();"? Or would you omit the "const"? How
does the fact that "const" is allowed inconvenience you?
It's another point of confusion. In my language I don't treat function
declarations like variable declarations. A function is not a
variable. There is no data storage associated with it.
In C, declarations can declare objects, functions, types, etc. I fail
to see how your language is relevant.
In C it is unfortunate, as it makes it hard to trivially distinguish a
function declaration (or the start of a function definition) from a
variable declaration.
It's not as hard as you insist on pretending it is. A function
declaration includes a pair of parentheses, either empty or
containing a list of parameters or parameter types.
Function declarations outside header files are valid, but tend to be
rare in well-written C code.
Hmm, in well-written code static functions are likely to be a
majority. Some people prefer to declare all functions and
put declarations of static functions in the same file as the
functions themselves. Consequently, function declarations are not
rare in such code. Do you consider it well-written?
Bart <bc@freeuk.com> writes:
[...]
The point is that there are restrictions on what can be combined into a single declaration. But these days it's usually considered good style
to declare only one identifier in each declaration, [...]
[...]
In my language [...],
variables declared in the same declaration have 100% the same type. If
they are even 1% different, then that is a separate type and they need
their own declarations. There are no gradations!
Bart <bc@freeuk.com> writes:
[...]
I can tell that in my syntax, function definitions start with a line
like this ([...] means optional; | separates choices):
['global'|'export'] 'func'|'proc' name ...
Which one do you think would be easier? (Function declarations are
generally not used.)
I don't care.
Yes, languages other than C can have better declaration syntax than C does
(where "better" is clearly subjective). Perhaps yours does. [...]
On 30.11.2024 02:28, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
I can tell that in my syntax, function definitions start with a line
like this ([...] means optional; | separates choices):
['global'|'export'] 'func'|'proc' name ...
Which one do you think would be easier? (Function declarations are
generally not used.)
I don't care.
Yes, languages other than C can have better declaration syntax than C does
(where "better" is clearly subjective). Perhaps yours does. [...]
From the various bits and pieces spread around I saw that Bart had
obviously adopted many syntactical elements of Algol 68, and I wonder
why he hadn't used just this language (or any "better" language than
"C") if he dislikes it so much that he even implemented own languages.
Bart <bc@freeuk.com> writes:
Perhaps you can post a trivial bit of C code which reads in C source
code and shows the first lines of all the function definitions, not
prototypes nor function pointers. It can assume that each starts at
the beginning of a line.
No. It's straightforward for an experienced C programmer looking at
code that's not deliberately obscure. A program that can do the same
thing reliably would have to include a C preprocessor and parser.
It's obvious that :
int foo(int);
is intended to be a function declaration,
I acknowledged elsewhere that I forgot about declarations of static functions. (Hundreds of function definitions in a single source file
seem unlikely.)
From the various bits and pieces spread around I saw that Bart had
obviously adopted many syntactical elements of Algol 68, and I wonder
why he hadn't used just this language (or any "better" language than
"C") if he dislikes it so much that he even implemented own languages.
But okay.
On 30/11/2024 01:28, Keith Thompson wrote:
It's obvious that :
int foo(int);
is intended to be a function declaration,
By itself? Sure. Within a large very busy source file, it'll get lost in
the noise. Where is the start of each function? I don't want to analyse
each line!
A week ago somebody on reddit posted a link to a C project. The source
code was unusual: it was 'clean' for a start, but also each function
started with:
func ....
Presumably that was some empty macro; I don't recall.
But an amazing thing happened: if viewed within my editor, I could
navigate between functions with PageUp and PageDown keys. That's never happened before with C.
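Something along these lines would have that effect; this is only a guess
at the idea, not the actual project's code:
#include <stdio.h>

#define func    /* expands to nothing; it is purely a visual, searchable marker */

func int add(int a, int b)
{
    return a + b;
}

func int main(void)
{
    printf("%d\n", add(2, 3));
    return 0;
}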
On 29/11/2024 16:42, David Brown wrote:
On 29/11/2024 15:15, Michael S wrote:
On Fri, 29 Nov 2024 13:33:30 +0000
Bart <bc@freeuk.com> wrote:
* It allows a list of variable names in the same declaration to each
have their own modifiers, so each can be a totally different type
They can't have "totally different" types - they can have added
indirection or array indicators, following C's philosophy of
describing the type by how the variable is used:
int x, *y, z[10];
Thus "x", "*y" and "z[i]" are all of type "int".
C's syntax allows a 14-parameter function F to be declared in the same statement as a simple int 'i'.
I'd say that F and i are different types! (Actually I wouldn't even
consider F to be a type, but a function.)
That F(1, 2, 3.0, "5", "six", seven, ...) might yield the same type as
'i' is irrelevant here.
Usually, given these declarations:
int A[100];
int *B;
int (*C)();
people would consider the types of A, B and C to be array, pointer and function pointer respectively. Otherwise, which of the 4 or 5 possible
types would you say that D has here:
int D[3][4][5];
It depends on how it is used in an expression, which can be any of &D,
D, D[i], D[i][j], D[i][j][k], none of which include 'Array' type!
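For reference, these are the types C itself gives those expressions (in
most contexts the array types then decay to pointers); a sketch, with i,
j and k assumed to be ints:
int D[3][4][5];
/* &D          : int (*)[3][4][5]  - pointer to the whole array   */
/* D           : int [3][4][5]     - decays to int (*)[4][5]      */
/* D[i]        : int [4][5]        - decays to int (*)[5]         */
/* D[i][j]     : int [5]           - decays to int *              */
/* D[i][j][k]  : int                                              */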
Here's another puzzler:
const int F();
why is 'const' allowed here? There is no storage involved. It's not as
though you could write 'F = 0' if there were no 'const'.
C allows this, but I personally would be happier if it did not. As
Michael says below, most serious programmers don't write such code.
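As a small sketch (not from the thread): the qualifier is accepted but
has essentially no effect on a by-value return, and compilers such as
gcc can warn about it (e.g. with -Wignored-qualifiers):
const int F(void);   /* accepted; the 'const' on the returned value is meaningless        */

const int F(void)
{
    return 42;       /* the caller just receives an int value; there is nothing to assign */
}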
It doesn't matter. If you're implementing the language, you need to
allow it.
If you're trying to figure out why some people have trouble understanding the language, it's something to consider.
It's also something to keep in mind if trying to understand somebody
else's code: are they making use of that feature or not?
So this is a wider view than just dismissing design misfeatures
because you personally won't use them.
With the kind of C I would write, you could discard everything after
C99, and even half of C99, because the subset I personally use is very conservative.
On 11/29/24 19:55, Waldek Hebisch wrote:
...
Hmm, in well-written code static functions are likely to be a
majority. Some people prefer to declare all functions and
put declarations of static functions in the same file as the
functions themselves. Consequently, function declarations are not
rare in such code. Do you consider it well-written?
I wouldn't go so far as to say that it's poorly written, but I don't
like the unnecessary redundancy of that approach. Whenever possible, I
prefer to let each static function's definition serve as its only declaration. This isn't possible, for instance, if you have a pair of mutually recursive functions.
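A minimal sketch of that mutually recursive case (function names
invented for illustration), where one forward declaration cannot be
avoided:
static int is_odd(int n);     /* forward declaration: is_even calls it before its definition */

static int is_even(int n)
{
    return n == 0 ? 1 : is_odd(n - 1);
}

static int is_odd(int n)
{
    return n == 0 ? 0 : is_even(n - 1);
}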
The redundancy between a header file's function declaration and the corresponding function definition is necessary, given the way that C
works. Avoiding that is one of the reasons I like declaring static
functions, where appropriate.
IMHO, any way to mix more than one 'modifier' (not in C standard
meaning of the word, but in more general meaning) is potentially
confusing. It does not matter whether modifier is 'const' or '*'
or [] or ().
On 29/11/2024 19:26, Bart wrote:
C's syntax allows a 14-parameter function F to be declared in the same
statement as a simple int 'i'.
And the laws of physics allow me to drop a 20 kg dumbbell on my toe.
That does not mean that anyone thinks it is a good idea.
I'd say that F and i are different types! (Actually I wouldn't even
consider F to be a type, but a function.)
Functions have types in most typed languages, including C.
And yes, F and i are different types - but they are related types. Use
the declared identifier in an expression of a form matching what you
wrote in the declaration, and the expression will have type "int".
That's how C's declarations work.
Really, most of this is pretty straightforward. No one is asking you to /like/ the rules of C's declarations (I personally dislike that a single declaration can be used for different types, even if they are related).
But /please/ stop pretending it's difficult to understand.
C allows this, but I personally would be happier if it did not. As
Michael says below, most serious programmers don't write such code.
It doesn't matter. If you're implementing the language, you need to
allow it.
I am not implementing the language. No one else here is implementing
it. You have, apparently, implemented at least some of the language
while being completely incapable of understanding it.
/That/ is how you solve problems with syntax that can be abused to write unclear code.
The C newbie will thank
you for the lesson, and move on to write C code without writing such
mixed declarations.
With the kind of C I would write, you could discard everything after
C99, and even half of C99, because the subset I personally use is very
conservative.
You say that as though you think it is a good thing - it is not.
On 11/29/24 22:25, Janis Papanagnou wrote:
...
From the various bits and pieces spread around I saw that Bart had
obviously adopted many syntactical elements of Algol 68, and I wonder
why he hadn't used just this language (or any "better" language than
"C") if he dislikes it so much that he even implemented own languages.
But okay.
No existing language meets Bart's needs as well as his own does. He
attributes this to all of the other language designers being idiots for
creating those languages, and to all the other languages' users being
idiots for not rejecting those languages. He refuses to accept the
possibility that his own preferences for language design might be
somewhat idiosyncratic.
If I write this
int *A, B[10], C(int);
However writing:
A; B; C;
creates expressions with types 'ref i32', 'ref i32', and 'ref
proc(i32)i32' according to C rules.
BTW here are A, B, C in my syntax:
ref i32 A
[10]i32 B
proc(i32)i32 C
(The last is a function declaration, which only exists for FFI functions;
it can only appear in an 'importdll' block.)
On 11/29/24 22:25, Janis Papanagnou wrote:
...
From the various bits and pieces spread around I saw that Bart had
obviously adopted many syntactical elements of Algol 68, and I wonder
why he hadn't used just this language (or any "better" language than
"C") if he dislikes it so much that he even implemented own languages.
But okay.
No existing language meets Bart's needs as well as his own does. He attributes this to all of the other language designers being idiots for creating those languages, and to all the other languages' users being
idiots for not rejecting those languages. He refuses to accept the possibility that his own preferences for language design might be
somewhat idiosyncratic.
For most people new to C, it's enough to tell them that "int* a, b;"
declares "a" as a "pointer to int" and "b" as an "int". You tell them
it is a bad idea to write such code, even re-arranged as "int *a, b;", because it is easy to get wrong - they should split the line into two declarations (preferably with initialisations). The C newbie will thank
you for the lesson, and move on to write C code without writing such
mixed declarations.
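A short sketch of that advice (names invented for illustration):
#include <stddef.h>

int* a, b;         /* legal, but misleading: a is "pointer to int", b is plain "int" */

int *p = NULL;     /* clearer: one identifier per declaration, with an initialiser   */
int count = 0;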
On 30/11/2024 03:25, Janis Papanagnou wrote:
From the various bits and pieces spread around I saw that Bart had
obviously adopted many syntactical elements of Algol 68, and I wonder
why he hadn't used just this language (or any "better" language than
"C") if he dislikes it so much that he even implemented own languages.
It needed to be a lower level language that could be practically
implemented on a then small machine.
Algol68 implementations were scarce especially on 8-bit systems.
But I also considered it too high level and hard to understand.
Even the
syntax had features I didn't like, like keyword stropping
and fiddly rules about semicolon placement.
As for better languages than C, there were very few at that level.
Even
C was not so practical: C compilers cost money (I wasn't a programmer,
my boss wouldn't pay for it!).
There would have been problems just getting it into the machine (since
on CP/M, every machine used its own disk format). And by the accounts I
read later on in old Byte magazine articles, C compilers were hopelessly
slow running on floppy disks. (Perhaps Turbo C excepted.)
By the time C might have been viable, I found that my language was preferable.
"int", "void" and "double" are totally different types in my view.
"int", "pointer to int", "array of int", "function returning int" all
have a relation that means I would not describe them as /totally/
different types - though I would obviously still call them /different/
types.
Function declarations outside header files are valid, but tend to be
rare in well-written C code.
A function definition - as typically written - is also a function declaration. So presumably you mean non-defining declaration here.
Some people have a style where they write forward declarations of all functions defined in a C file near the top of the file. I am not a fan
of that myself - especially as over time, this redundant information is rarely kept fully in sync with the rest of the code.
On 30.11.2024 12:59, Bart wrote:
But I also considered it too high level and hard to understand.
This I find astonishing, given that it is (IMO; and different from C)
a so cleanly defined language.
Even the
syntax had features I didn't like, like keyword stropping
Stropping was a way to solve the limited characters available in the
system character sets. Practically, as an implementer, you could use
any mechanism you like. (On the mainframe I had used symbols preceded
by a dot, the Genie compiler uses uppercase, for example. None is a
problem for the implementer.)
and fiddly rules about semicolon placement.
Huh? - The semicolon placement as delimiters is quite clear and (as so
many things in Algol 68) also clearly defined (IMO). - So what do you
have in mind here?
As for better languages than C, there were very few at that level.
(But you know you can use Algol 68 on a system development level; we
can read that it had been done in those days. - All that's "missing",
and that's a good design decision, were pointers.)
There would have been problems just getting it into the machine (since
on CP/M, every machine used its own disk format). And by the accounts I
read later on in old Byte magazine articles, C compilers were hopelessly
slow running on floppy disks. (Perhaps Turbo C excepted.)
(I don't get what argument you are trying to make. - That you wanted
some terse language, maybe, as you already said above?)
Bart <bc@freeuk.com> writes:
On 29/11/2024 20:35, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
C's syntax allows a 14-parameter function F to be declared in the same
statement as a simple int 'i'.
Yes (except that it's a declaration, not a statement):
int i = 42, F(int, int, int, int, int, int, int,
int, int, int, int, int, int, int);
Are you under the impression that anyone here was not already aware of
that? Would you prefer it if the number of parameters were arbitrarily
restricted to 13?
Do you think that anyone would actually write code like the above?
C generally doesn't impose arbitrary restrictions. Because of that,
it's possible to write absurd code like the declaration above. 99% of
programmers simply don't do that, so it's not a problem in practice.
I'd say that F and i are different types! (Actually I wouldn't even
consider F to be a type, but a function.)
Neither F nor i is a type. i is an object (of type int), and F is a
function (of type int(int, int, int, int, int, int, int, int, int, int,
int, int, int, int)).
That F(1, 2, 3.0, "5", "six", seven, ...) might yield the same type as
'i' is irrelevant here.
It's relevant to the syntax. i and F can be declared in the same
declaration only because the type of i and the return type of F happen
to be the same. If F returned void, i and F would have to be declared
separately.
Which, of course, is a good idea anyway.
You're posting repeatedly trying to convince everyone that C allows
ridiculous code. We already know that. You are wasting everyone's time
telling us something that we already know. Most of us just don't obsess
about it as much as you do. Most of us recognize that, however
convoluted C's declaration syntax might be, it cannot be fixed in a
language calling itself "C".
Most of us here are more interested in talking about C as it's
specified, and actually trying to understand it, than in complaining
about it.
Usually, given these declarations:
int A[100];
int *B;
int (*C)();
people would consider the types of A, B and C to be array, pointer and
function pointer respectively. Otherwise, which of the 4 or 5 possible
types would you say that D has here:
int D[3][4][5];
It depends on how it is used in an expression, which can be any of &D,
D, D[i], D[i][j], D[i][j][k], none of which include 'Array' type!
No, the object D unambiguously has type int[3][4][5].
(So it would have a different type from E declared in the same
declaration:
int D[3][4][5], E;
? In that case tell that to David Brown!)
Yes, of course D and E have different types. I'm certain he's
aware of that.
I wrote that the object D is unambiguously of type int[3][4][5], and the expression D can be of the array type int[3][4][5] or of the pointer
type int(*)[4][5], depending on the context. Do you agree? Or do you
still claim that D can have any of "4 or 5 possible types"?
(Note that I'm not talking about the type of the expression D[i] or of
any other expression that includes D as a subexpression.)
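A two-line sketch of what that means in code (variable names invented):
int D[3][4][5];
int (*pd)[4][5]    = D;    /* in most expression contexts D decays to a pointer to its first element */
int (*pa)[3][4][5] = &D;   /* &D, by contrast, is a pointer to the whole array object                */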
You seem to have missed the point of my post, which was a reply to
David's remark that 'they can't have totally different types' which
was in response to my saying that each variable in the same
declaration can 'be [of] a totally different type'.
David apparently has a different definition of "totally different types"
than you do. Since the standard doesn't define that phrase, I suggest
not wasting time arguing about it.
Given:
int D[3][4][5], E;
the object D is of type int[3][4][5], and E is of type int. Do you understand that?
If you wanted to change the type of D from int[3][4][5] to
double[3][4][5], you'd have to use two separate declarations.
Do you understand that? (Of course you do, but will you admit that
you understand it?)
I think that distinction is what David had in mind. double[3][4][5] and
int are "totally different types", but int[3][4][5] and int are not.
Entities of "totally different types" cannot be declared in a single declaration. You don't have to accept that meaning of the phrase (which
I find a bit vague), but it's clearly what David meant.
The point is that there are restrictions on what can be combined into a single declaration. But these days it's usually considered good style
to declare only one identifier in each declaration, so while this :
int i, *p;
is perfectly valid, and every C compiler must accept it, this :
int i;
int *p;
is preferred by most C programmers.
Do you understand that?
DB is assuming the type of the variable after it's been used in an
expression that is fully evaluated to yield its base type. So my
A[100] is used as A[i], and D[3][4][5] is used as D[i][j][k].
But of course they may be evaluated only partially, yielding a range
of types.
What "range of types" do you think D can have?
Would you write "const int F();"? Or would you omit the "const"? How
does the fact that "const" is allowed inconvenience you?
It's another point of confusion. In my language I don't treat function
declarations like variable declarations. A function is not a
variable. There is no data storage associated with it.
In C, declarations can declare objects, functions, types, etc. I fail
to see how your language is relevant.
In C it is unfortunate, as it makes it hard to trivially distinguish a
function declaration (or the start of a function definition) from a
variable declaration.
It's not as hard as you insist on pretending it is. A function
declaration includes a pair of parentheses, either empty or
containing a list of parameters or parameter types.
Function declarations outside header files are valid, but tend to be
rare in well-written C code.
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Bart <bc@freeuk.com> writes:
It's another point of confusion. In my language I don't treat function
declarations like variable declarations. A function is not a
variable. There is no data storage associated with it.
In C, declarations can declare objects, functions, types, etc. I fail
to see how your language is relevant.
In C it is unfortunate, as it makes it hard to trivially distinguish a
function declaration (or the start of a function definition) from a
variable declaration.
It's not as hard as you insist on pretending it is. A function
declaration includes a pair of parentheses, either empty or
containing a list of parameters or parameter types.
Function declarations outside header files are valid, but tend to be
rare in well-written C code.
Hmm, in well-written code static functions are likely to be a
majority. Some people prefer to declare all functions and
put declarations of static functions in the same file as the
functions themselves. Consequently, function declarations are not
rare in such code. Do you consider it well-written?
On 30/11/2024 01:55, Waldek Hebisch wrote:
Hmm, in well-written code static functions are likely to be a
majority. Some people prefer to declare all functions and
put declarations of static functions in the same file as the
functions themselves. Consequently, function declarations are not
rare in such code. Do you consider it well-written?
Without doubt, most functions (and non-local data) should be static.
However, IMHO writing (non-defining) declarations for your static
functions is a bad idea unless it is actually necessary to the code
because you are using them in function pointers or have particularly
good reasons for the way you order your code.
I don't find redundant declarations of static functions at all useful -
and I find them of significant cost in maintaining files. It is far too easy to forget to update them when you change, delete or add new
functions. And a list of such declarations that you don't feel you can trust entirely, is worse than useless.
Such lists might have been helpful to some people decades ago, when
editors were more primitive. If I need a list of functions in a file
(maybe it's someone else's code, or old code of mine), any programmer's editor or IDE will give me it - updated correctly in real-time, and not
out of sync.
On 01/12/2024 11:34, David Brown wrote:
"int", "void" and "double" are totally different types in my view.
"int", "pointer to int", "array of int", "function returning int" all
have a relation that means I would not describe them as /totally/
different types - though I would obviously still call them /different/
types.
What about 'array of int', 'array of double' and 'array of void*'; do
they have a relation too?
Your examples, expressed left-to-right, happen to share the last
element. They could also share other elements; so what?
Function declarations outside header files are valid, but tend to be
rare in well-written C code.
A function definition - as typically written - is also a function
declaration. So presumably you mean non-defining declaration here.
Some people have a style where they write forward declarations of all
functions defined in a C file near the top of the file. I am not a
fan of that myself - especially as over time, this redundant
information is rarely kept fully in sync with the rest of the code.
That's a separate problem. But without forward declarations, at some
point you're going to add some expression in the middle of the file, but
find you're calling a function which is declared later on in the file
rather than earlier.
That's not something you want to waste time thinking about.
On 30/11/2024 15:57, David Brown wrote:
On 29/11/2024 19:26, Bart wrote:
C's syntax allows a 14-parameter function F to be declared in the
same statement as a simple int 'i'.
And the laws of physics allow me to drop a 20 kg dumbbell on my toe.
That does not mean that anyone thinks it is a good idea.
Who said it's a good idea? I merely said that C allows such disparate
types in declarations. You disagree that they are different types, while
at the same time saying it's a bad idea to mix them in the same
declaration!
I'd say that F and i are different types! (Actually I wouldn't even
consider F to be a type, but a function.)
Functions have types in most typed languages, including C.
And yes, F and i are different types - but they are related types.
Use the declared identifier in an expression of a form matching what
you wrote in the declaration, and the expression will have type "int".
That's how C's declarations work.
That's not how people's minds work.
If you declare A, B, and C, then
what is important is the types of A, B, and C, not what might be
yielded as the result of some expression.
With the kind of C I would write, you could discard everything after
C99, and even half of C99, because the subset I personally use is
very conservative.
You say that as though you think it is a good thing - it is not.
Why?
I reckon people will have an easier time understanding and working with
my code than yours.
It will at least work with more compilers.
On 01/12/2024 09:36, Janis Papanagnou wrote:
On 30.11.2024 12:59, Bart wrote:
[About Algol68]
But I also considered it too high level and hard to understand.
This I find astonishing, given that it is (IMO; and different from C)
a so cleanly defined language.
Algol68 was famous for its impenetrable specification. Its Revised
Report was the programming language equivalent of James Joyce's 'Ulysses'.
I needed a clean simple syntax and 100% obvious and explicit semantics.
[ Stropping ]
Yes, but they made writing, reading and maintaining source code
impossible. [...]
[...]
If I really need to use a reserved word as an identifier now [...]
[ snip examples of "Bart's language" ]
and fiddly rules about semicolon placement.
Huh? - The semicolon placement as delimiters is quite clear and (as so
many things in Algol 68) also clearly defined (IMO). - So what do you
have in mind here?
It just makes life harder. It special-cases the last statement of any
block, which must be semicolon free, as it's strictly a separator. So:
* Adding a new statement to the end of a block, you must apply ; to the
current last statement
* Deleting the last line, you must delete the ; on the previous.
* Move any of the lines about, and you may again need to update the semicolons if the last was included
* Temporarily comment out lines including the last, you must also
temporarily remove ; from the line before the comments
* Copy the whole block elsewhere, you might need to add ;
* Temporarily comment out a whole block (or start off with an empty
block that will be populated later) you need to use SKIP, another
annoyance.
Usually you're not aware of this until the compiler tells you and you
have to go back in and fix it.
Allow semicolons to be a /terminator/, and all that goes away. It's a no brainer.
But then I don't like having to write semicolons at all, and
generally I don't.
The whole thing with stropping and semicolons is just a colossal
time-waster.
As for better languages than C, there were very few at that level.
(But you know you can use Algol 68 on a system development level; we
can read that it had been done at those day. - All that's "missing",
and that's a good design decision, were pointers.)
[ re-iterated speed argument in comparison with "own" languages
while completely neglecting the other factors (including speed
of development process) snipped ]
It's quite unsuited to systems programming, and not just because of its execution speed. However, I'd quite like to see A68G implemented in A68G!
Algol68 was a fascinating and refreshing language back then. It looked
great when typeset in a book. But its practicalities were annoying, and
now it is quite dated.
[...]
On 30/11/2024 01:55, Waldek Hebisch wrote:
Hmm, in well-written code static functions are likely to be a
majority. Some people prefer to declare all functions and
put declarations of static functions in the same file as the
functions themselves. Consequently, function declarations are not
rare in such code. Do you consider it well-written?
Without doubt, most functions (and non-local data) should be static.
On 01/12/2024 13:50, David Brown wrote:
On 30/11/2024 01:55, Waldek Hebisch wrote:
Hmm, in well-written code static functions are likely to be a
majority. Some people prefer to declare all functions and
put declarations of static functions in the same file as the
functions themselves. Consequently, function declarations are not
rare in such code. Do you consider it well-written?
Without doubt, most functions (and non-local data) should be static.
I have a tool that translates C programs to my syntax. Most functions of codebases I tried are marked 'global', because the C version did not use 'static'.
Generally those functions don't need to be exported. This is just
laziness or ignorance on the part of the programmer, not helped by C using
the wrong default.
However, IMHO writing (non-defining) declarations for your static
functions is a bad idea unless it is actually necessary to the code
because you are using them in function pointers or have particularly
good reasons for the way you order your code.
A good reason might be NOT CARING how the code is ordered.
I don't find redundant declarations of static functions at all useful
- and I find them of significant cost in maintaining files. It is far
too easy to forget to update them when you change, delete or add new
functions. And a list of such declarations that you don't feel you
can trust entirely, is worse than useless.
Why doesn't the compiler report a declaration that doesn't match the definition?
Such lists might have been helpful to some people decades ago, when
editors were more primitive. If I need a list of functions in a file
(maybe it's someone else's code, or old code of mine), any
programmer's editor or IDE will give me it - updated correctly in
real-time, and not out of sync.
Why isn't this a problem for exported/shared functions?
That is, for all sorts of functions and variables declared in headers
where there is a declaration in header, and a definition in some 'home' module.
Michael S <already5chosen@yahoo.com> writes:
IMHO, any way to mix more than one 'modifier' (not in C standard
meaning of the word, but in more general meaning) is potentially
confusing. It does not matter whether modifier is 'const' or '*'
or [] or ().
It surprises me that you would say this. Certainly there are type
forms that might be difficult to absorb (e.g., 'float *********')
but that doesn't mean they are necessarily confusing. There are two
obvious ways to write type forms that are easy to decode. One way
is to write any derivings right-to-left:
[] * (double,double) * float
which can be read directly as "array of pointer to function that
returns a pointer to float", and the other way is simply the reversal
of that:
float * (double,double) * []
which can be read right-to-left the same way. The constructors for
derived types (pointer, array, function) act like nouns. Qualifiers
such as const or volatile act like adjectives and always go to the
left of the noun they modify, so for example
[] volatile* float
is an array of volatile pointer to float, or in the other ordering
float volatile* []
which is simply a reversal of noun phrases, with any modifying
adjectives staying on the left side of the noun they modify.
The syntax used in C is harder to read for two reasons: one, the
ordering of derivations is both left-to-right and right-to-left,
depending on what derivation is being applied; and two, any
identifier being declared goes in the middle of the type rather
than at one of the ends. Both of those confusions can be removed
simply by using a consistent ordering, either left-to-right or
right-to-left (with qualifying adjectives always on the left of
the noun they modify).
Note that both of the consistent orderings correspond directly to a
natural English wording, which accounts for them being easier to
comprehend than C-style type forms. (I conjecture that some foreign languages might not have that property, but since I am for the most
part ignorant of essentially all natural languages other than
English I have no more to say about that.)
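For comparison, the same two example types spelled in C's declaration
syntax (the names x and y are invented for illustration):
/* "array of pointer to function (double, double) returning pointer to float" */
float *(*x[10])(double, double);

/* "array of volatile pointer to float" */
float * volatile y[10];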
On 01.12.2024 12:52, Bart wrote:
Yes, but they made writing, reading and maintaining source code
impossible. [...]
Really? - For me it's exactly the opposite; having the keywords stand
out lexically (or graphically) is what adds to legibility!
(I admit that hitting the Shift or the Caps-Lock key may be considered cumbersome by [some/most] people. - I "pay the price" for legibility.)
[ snip examples of "Bart's language" ]
(It makes no sense to compare Algol 68 with "your language"
that with me. - I understood that you find it a good idea to implement
an own [irrelevant] language
(You may be a
candidate for using an IDE that alleviates you from such mundane
tasks.)
With your argumentation I'm curious what you think about having
to add a semicolon in "C" if you replace a {...} block.
Or, in the first place, what you think about semicolons in "C"s
'if-else' construct (with parenthesis-blocks or single statements).
And what's actually the "statement" in 'if(b)s;' and 'else s;'
and what you think about 'if(b){}else{}' being a statement (or
not, since it's lacking a semicolon).
(This is obviously an issue you have; not the language. You should
have better written "Usually I'm not aware of this ...". And that's
of course a fair point [for you].)
Allow semicolons to be a /terminator/, and all that goes away. It's a no
brainer.
History and also facts of contemporary languages disagree with you.
(Re: "no brainer": You need a brain to understand or know that, of
course. - So my suggestion to you is obvious; inform yourself.)
For example I find it a "colossal time-waster" to write an own
language given the many different existing ones
- some even available
in source code to continue working on an existing code base. Colossal
is here a really perfect chosen adjective. - Your scale seems to have
got impaired; you spot marginal time "wastes" and miss the real ones, qualitatively and quantitatively.)
It's quite unsuited to systems programming, and not just because of its
execution speed. However, I'd quite like to see A68G implemented in A68G!
I've heard and read, as I said, a differing thing about that.
Specifically I recall having read about that special topic you
mention of writing an Algol 68 compiler in Algol 68; it has been
done.
(Your personal preferences and enthusiasm should not get in the way
of either checking the facts or formulating your opinions/thoughts as
what they are, here basically wrong assumptions based on ignorance.)
It makes me smile if you speak about "looking great when typeset",
given that the languages we use nowadays, specifically (e.g.) "C",
C++, don't even look good "when typeset".
And the problems you/we
buy with that are directly observable in the languages. Rather we
seem to have accepted all their deficiencies and just work through
(or around) them. Most do that with no complaints. What I find
astonishing is that you - here known to complain about a lot of "C"
details - are now praising things (and at the same time despise
sensible concepts in an exceptionally well designed language as
Algol 68).
I can see some advantages in a language being happy with any order of function definition, without requiring forward declarations to use a
function before it is defined. But C is not like that, and I cannot
honestly say it bothers me one way or the other. And apparently, it
does not particularly bother many people - there is, I think, no
serious impediment or backwards compatibility issue that would
prevent C being changed in this way. Yet no one has felt the need
for it - at least not strongly enough to fight for it going in the
standard or being a common compiler extension.
On Sun, 1 Dec 2024 15:34:04 +0100
David Brown <david.brown@hesbynett.no> wrote:
I can see some advantages in a language being happy with any order of
function definition, without requiring forward declarations to use a
function before it is defined. But C is not like that, and I cannot
honestly say it bothers me one way or the other. And apparently, it
does not particularly bother many people - there is, I think, no
serious impediment or backwards compatibility issue that would
prevent C being changed in this way. Yet no one has felt the need
for it - at least not strongly enough to fight for it going in the
standard or being a common compiler extension.
I think, arguing in favor of such change would be easier on top of
the changes made in C23.
Before C23 there were, as you put it "no serious impediment or
backwards compatibility issue". After C23 we could more categorical
claim that there are no new issues.
On 01/12/2024 15:23, Bart wrote:
Such lists might have been helpful to some people decades ago, when
editors were more primitive. If I need a list of functions in a file
(maybe it's someone else's code, or old code of mine), any
programmer's editor or IDE will give me it - updated correctly in
real-time, and not out of sync.
Why isn't this a problem for exported/shared functions?
That is, for all sorts of functions and variables declared in headers
where there is a declaration in header, and a definition in some
'home' module.
What do you mean here?
I certainly consider it a weakness in C that you don't have clear requirements and limitations for what can be in a header or a C file, or
how things can be mixed and matched. Keeping code clear and
well-ordered therefore requires discipline and standardised arrangement
of code and declarations. Different kinds of projects will have
different requirements here, but for my own code I find it best to be
strict that for any C file "file.c", there will be a header "file.h"
which contains "extern" declarations of any exported functions or data,
along with any type declarations needed to support these. My tools will warn on any mismatches, such as non-static functions without a matching "extern" declaration. They can't catch everything - the way C is built
up, there is no distinction between external declarations that should be defined in the same module and ones that are imported from elsewhere.
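A sketch of that convention (file and function names invented for
illustration); having file.c include its own header is what lets the
compiler report a declaration that no longer matches its definition:
/* file.h */
extern int file_count(void);

/* file.c */
#include "file.h"    /* the definition below is now checked against the declaration above */

int file_count(void)
{
    return 0;
}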
David Brown <david.brown@hesbynett.no> writes:
On 30/11/2024 00:44, Keith Thompson wrote:
A function definition - as typically written - is also a function
declaration. So presumably you mean non-defining declaration here.
Yes.
Some people have a style where they write forward declarations of all
functions defined in a C file near the top of the file. I am not a
fan of that myself - especially as over time, this redundant
information is rarely kept fully in sync with the rest of the code.
But it is definitely something you'll sometimes see in real-world C
code. (You could argue that the code is then not "well-written" C,
but that would be a very subjective opinion.)
Yes, that was an oversight on my part.
If someone wanted to ensure that all static functions defined in a translation unit are declared near the top, there could be a separate
tool to generate, or at least check, the declarations. I'm not aware of
any such tool, which suggests there probably isn't much demand for it.
On 01/12/2024 15:50, David Brown wrote:
On 01/12/2024 15:23, Bart wrote:
Such lists might have been helpful to some people decades ago, when
editors were more primitive. If I need a list of functions in a
file (maybe it's someone else's code, or old code of mine), any
programmer's editor or IDE will give me it - updated correctly in
real-time, and not out of sync.
Why isn't this a problem for exported/shared functions?
That is, for all sorts of functions and variables declared in headers
where there is a declaration in header, and a definition in some
'home' module.
What do you mean here?
You said you didn't want a list of declarations to maintain for static functions within a module.
But for non-static functions, which are shared via a header, you /need/
such a list to be maintained:
prog.h: int F(int);
prog.c: #include "prog.h"
static int G(int a);
int F(int a) {return 0;}
static int G(int a) {return 0;}
Here, you object to having to maintain the declaration for G, but you
still need to do so for F, and inside a separate file.
The declaration for F could also get out of sync, but you don't consider
that a problem?
And if it isn't because your tools help with this, then they can help
with G too.
I certainly consider it a weakness in C that you don't have clear
requirements and limitations for what can be in a header or a C file,
or how things can be mixed and matched. Keeping code clear and
well-ordered therefore requires discipline and standardised
arrangement of code and declarations. Different kinds of projects
will have different requirements here, but for my own code I find it
best to be strict that for any C file "file.c", there will be a header
"file.h" which contains "extern" declarations of any exported
functions or data, along with any type declarations needed to support
these. My tools will warn on any mismatches, such as non-static
functions without a matching "extern" declaration. They can't catch
everything - the way C is built up, there is no distinction between
external declarations that should be defined in the same module and
ones that are imported from elsewhere.
Yes, this is why a module scheme (such as the kind I use) is invaluable.
In the example above, you'd define both F and G in one place. There is
no header and there are no separate declarations.
If another module wishes to use F, then it imports the whole module that defines F.
Some schemes can selectively import individual functions, but to me
that's pointless micro-managing.
In my scheme, it is not even necessary for individual modules to
explicitly import each other: a simple list of modules is provided in
one place, and they will automatically import each others' exported
entities (which include functions, variables, types, enums, structs,
named constants, and macros).
On 01/12/2024 21:12, Bart wrote:
Yes, this is why a module scheme (such as the kind I use) is invaluable.
Agreed. C does not have a real module scheme as such. But it supports getting similar effects - you just have to be disciplined in the way you write your headers. This has the disadvantage of being less consistent than, say, Pascal or Modula 2, especially if the programmer is not disciplined. And it has the advantage in flexibility - I have a scheme
that I like and that works well for the kind of code I work with, but
other people prefer other schemes. It's easy to fall into the trap of
"my way is the right way", especially when you make your own language
and you are the only user, but there is always a balance to be sought
between consistency and flexibility.
In the example above, you'd define both F and G in one place. There is
no header and there are no separate declarations.
If another module wishes to use F, then it imports the whole module
that defines F.
Some schemes can selectively import individual functions, but to me
that's pointless micro-managing.
To me, it is absolutely vital that the importing unit can only see the identifiers that were explicitly exported.
It is also absolutely vital
(and this is a critical missing feature for C - and a good reason to
switch to C++ even if you use no other feature of that language) that
the imported identifiers be in their own namespace so that they do not conflict with identifiers in the importing unit. If the language
provides a feature for importing the external identifiers directly into
the current unit's namespace, then it has to allow selective import of identifiers - otherwise all concepts of scalability and modularity go
out the window.
In my scheme, it is not even necessary for individual modules to
explicitly import each other: a simple list of modules is provided in
one place, and they will automatically import each others' exported
entities (which include functions, variables, types, enums, structs,
named constants, and macros).
That sounds, frankly, utterly terrible for anyone who worked with other people.
On 02/12/2024 10:30, David Brown wrote:
On 01/12/2024 21:12, Bart wrote:
Yes, this is why a module scheme (such as the kind I use) is invaluable. >>>
Agreed. C does not have a real module scheme as such. But it
supports getting similar effects - you just have to be disciplined in
the way you write your headers. This has the disadvantage of being
less consistent than, say, Pascal or Modula 2, especially if the
programmer is not disciplined. And it has the advantage in
flexibility - I have a scheme that I like and that works well for the
kind of code I work with, but other people prefer other schemes. It's
easy to fall into the trap of "my way is the right way", especially
when you make your own language and you are the only user, but there
is always a balance to be sought between consistency and flexibility.
In the example above, you'd define both F and G in one place. There
is no header and there are no separate declarations.
If another module wishes to use F, then it imports the whole module
that defines F.
Some schemes can selectively import individual functions, but to me
that's pointless micro-managing.
To me, it is absolutely vital that the importing unit can only see the
identifiers that were explicitly exported.
Speaking for my scheme, that is exactly what happens.
(First, I should say that my programs - sets of source files that will comprise one binary - are organised into 'subprograms', each of which is
a chummy collection of modules that effectively import each other. The following is about one subprogram.)
Only entities marked with 'global' are visible from other modules. But if module X exports names A, B, C, all will be visible from Y.
Further, exported names D, E, F from Y will be visible from X. Imports
can be circular (but subprograms are hierarchical).
What I object to in other schemes are:
* Where each of dozens of modules contains a ragtag list of imports at
the top. These look untidy and need to be endlessly maintained as a
fresh imported function is needed from module not yet listed, or an
import needs to be deleted as references to it are dropped. (This used
to be my scheme too!)
* Where functions are selectively imported from each module. FGS!
Instead of a few dozen imports, now there could be hundreds of lines of imported function names to maintain. You'd have time for nothing else!
It is also absolutely vital (and this is a critical missing feature
for C - and a good reason to switch to C++ even if you use no other
feature of that language) that the imported identifiers be in their
own namespace so that they do not conflict with identifiers in the
importing unit. If the language provides a feature for importing the
external identifiers directly into the current unit's namespace, then
it has to allow selective import of identifiers - otherwise all
concepts of scalability and modularity go out the window.
Each of my modules creates a namespace. (Also, each of my subprograms
creates one namespace for all entities exported from the whole library:
that requires 'export' rather than 'global'.)
However that namespace is rarely used. If this module imports X which
exports function F, then I can write F() instead of X.F().
I only need to use the namespace if:
* This module imports two modules that both export F so there is an
ambiguity (this is reported)
* This module has its own F so it shadows any imported versions. This is
not reported, but has the issue that, if a local F is freshly created,
it can silently shadow the previously imported F().
In my scheme, it is not even necessary for individual modules to
explicitly import each other: a simple list of modules is provided in
one place, and they will automatically import each others' exported
entities (which include functions, variables, types, enums, structs,
named constants, and macros).
That sounds, frankly, utterly terrible for anyone who worked with
other people.
You've never used my scheme.
One significant advantage is that because
all modules (and subprogram imports) are listed in the lead module
(usually that's all it contains), it is very easy to build a different configuration using an alternative lead module with a different collection.
Alternatively, different modules can be commented in or out, in one
place. Below is the lead module of my C compiler, what is submitted to
my main compiler. No other modules contain any project info (only pcl.m,
a separate subprogram/library).
The only thing I haven't yet figured out is how the compiler knows the location of an imported library which may reside elsewhere in the file system. For now this is hardcoded.
On 01.12.2024 17:42, Bart wrote:
On 01/12/2024 15:08, Janis Papanagnou wrote:
On 01.12.2024 12:52, Bart wrote:
makes typing easier because it is case-insensitive,
I don't think that case-insensitivity is a Good Thing. (I also don't
think it's a Bad Thing.)
But I want my software maintainable and readable. So my experience
is that I want some lexical "accentuation"; common answers to that
are for identifiers (for example) Camel-Case (that I used in C++), underscores (that I use in Unix shell, Awk, etc.), or spaces (like
in Algol 68, but which is practically irrelevant for me).
it's not fussy about semicolons,
From the languages I know of in detail and I'm experienced in none
is "fussy" about semicolons. Rather it's a simple and well designed >syntactical token, whether used as separator or terminator. You've
just to put it where it's defined.
On 01/12/2024 15:08, Janis Papanagnou wrote:
On 01.12.2024 12:52, Bart wrote:
Yes, but they made writing, reading and maintaining source code
impossible. [...]
Really? - For me it's exactly the opposite; having the keywords stand
out lexically (or graphically) is what adds to legibility!
In my syntax, you can write keywords in capitals if you want. It's case-insensitive! People using my scripting language liked to capitalise them. But now colour-highlighting is widely used.
(I admit that hitting the Shift or the Caps-Lock key may be considered
cumbersome by [some/most] people. - I "pay the price" for legibility.)
There's a lot of Shift and Caps-Lock with writing C or C-style syntax.
[...]
The point is, these are exceptions; Algol68 requires every reserved
word, which includes type names, to be stropped. It gives a very
peculiar look to source code, which you see very rarely in other languages.
that with me. - I understood that you find it a good idea to implement
an own [irrelevant] language
You keep saying that. It's a real language and has been tried and tested
over decades. Maybe it would be better if I'd just made up hypothetical features and posted about ideas?
(You may be a
candidate for using an IDE that alleviates you from such mundane
tasks.)
I use a syntax that alleviates me from that!
Many languages allow trailing commas in multi-line lists. The reason is EXACTLY to simplify maintenance. But you're suggesting it is only me who
has such a problem with this stuff. Obviously others do as well.
With your argumentation I'm curious what you think about having
to add a semicolon in "C" if you replace a {...} block.
That's just more fun and games. I don't get the rules there either.
Sometimes "};" is needed; sometimes it's not needed but is harmless; sometimes it can cause an error.
Or, in the first place, what you think about semicolons in "C"s
'if-else' construct (with parenthesis-blocks or single statements).
And what's actually the "statement" in 'if(b)s;' and 'else s;'
and what you think about 'if(b){}else{}' being a statement (or
not, since it's lacking a semicolon).
That's something else that Algol68 fixed, and which other languages have copied (Lua for one).
[ snip personal problems with writing Algol 68 programs ]
[...]
- some even available
in source code to continue working on an existing code base. Colossal
is here a really perfect chosen adjective. - Your scale seems to have
got impaired; you spot marginal time "wastes" and miss the real ones,
qualitatively and quantitatively.)
I put a lot of weight on syntax; obviously you don't.
My syntax
makes typing easier because it is case-insensitive,
there is considerably less punctuation,
it's not fussy about semicolons,
it allows type-sharing more,
it doesn't need separate declarations,
or headers, or ....
The end result is that less text needs to be typed, source looks cleaner
and it's less error prone. I don't need to write:
for (int index = 0; index < N; ++index)
for example. Or, to share a named entity, I don't need to write two
versions of it, one here and the other in a shared header. You don't
think that is a good thing?
So what bad language features do you think are time-wasters that I
should instead look at?
It's quite unsuited to systems programming, and not just because of its
execution speed. However, I'd quite like to see A68G implemented in
A68G!
I've heard and read, as I said, a differing thing about that.
Specifically I recall to have read about that special topic you
mention of writing an Algol 68 compiler in Algol 68; it has been
done.
I'm sure it has. My point about A68G is that it is an interpreter, a fairly
slow one.
So how fast would A68 code run under an interpreter running
under A68G?
(Your personal preferences and enthusiasm should not get in the way
of either checking the facts or formulating your opinions/thoughts as
what they are, here basically wrong assumptions based on ignorance.)
Really? I've written countless compilers and interpreters. Mainly I
devised systems programming languages. You think I don't know my field?
IMO A68 is unsuitable for such things, and A68G doubly so.
It makes me smile if you speak about "looking great when typeset",
given that the languages we use nowadays, specifically (e.g.) "C",
C++, don't even look good "when typeset".
Yeah. The first time I saw C code was in K&R1, in a book I bought in
1982 (for £12; a lot of money). It looked dreadful. The typeface used
made it look anaemic. That really put me off, more than the practical problems.
[...]
[...]
I admire languages that adapt and evolve. Fortran for example. C adapted poorly and slowly. Algol68 apparently hasn't evolved at all. I guess it couldn't do so without changing its RR, a big undertaking.
Which means it's stuck in the 1960s with some dated design choices.
BTW below is an actual example of Algol68 for A68G. It shows various
issues, other than syntax (but notice those jarring ";" after the END of
each function).
You can't mix signed/unsigned arithmetic easily; it
needs BITS, which are awkward to initialise.
It is really dreadful. It makes writing in C attractive!
[ Low-level code sample in Algol 68 vs. Bart's language snipped ]
On 02/12/2024 13:24, Bart wrote:
[...]
That sounds like you have a very poor organisation of your files. [...]
On 02/12/2024 13:24, Bart wrote:
I would consider any kind of automatic import system to be a bad idea.
I like structured and modular programming. It is strange to me that
someone would have the tools for that, and then choose not to use them.
But some people seem to prefer piling everything together in one name
space.
Indeed - that's why I say it /sounds/ terrible to me. As I said above,
I prefer a clear and modular organisation. If I am writing module "foo"
and a colleague is writing module "bar" as part of the same program, the
last thing we want is automatic import of each other's modules, symbols, functions, etc. - especially not jumbled into the main namespace for the code!
On 02.12.2024 16:24, David Brown wrote:
On 02/12/2024 13:24, Bart wrote:
[...]
That sounds like you have a very poor organisation of your files. [...]
A suspicion that I got as well (also from other posts of him).
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 01.12.2024 17:42, Bart wrote:
On 01/12/2024 15:08, Janis Papanagnou wrote:
On 01.12.2024 12:52, Bart wrote:
makes typing easier because it is case-insensitive,
I don't think that case-insensitivity is a Good Thing. (I also don't
think it's a Bad Thing.)
I think it's a _real bad thing_ in almost every context related
to programming.
But I want my software maintainable and readable. So my experience
is that I want some lexical "accentuation"; common answers to that
are for identifiers (for example) Camel-Case (that I used in C++),
underscores (that I use in Unix shell, Awk, etc.), or spaces (like
in Algol 68, but which is practically irrelevant for me).
CamelCase reduces typing speed and adds little benefit when compared
with the alternatives (rational abbreviations, or even underscores).
On 02.12.2024 19:13, Scott Lurndal wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 01.12.2024 17:42, Bart wrote:
On 01/12/2024 15:08, Janis Papanagnou wrote:
On 01.12.2024 12:52, Bart wrote:
makes typing easier because it is case-insensitive,
I don't think that case-insensitivity is a Good Thing. (I also don't
think it's a Bad Thing.)
I think it's a _real bad thing_ in almost every context related
to programming.
Can you give some hints (or keywords) to be able to better understand
your opinion?
I cannot follow your speed argument - for speed questions ask Bart ;-)
For camel-case I need just the Shift-key, and underscores require
me to type an extra character, the underscore (which also needs the Shift-key), so the latter should be slower (and [for me] is slower).
On 02/12/2024 15:24, David Brown wrote:
On 02/12/2024 13:24, Bart wrote:
I would consider any kind of automatic import system to be a bad idea.
It's not automatic; you have to list the modules. If you don't want data shared, then don't export it!
If you want selected imports between arbitrary module subsets, then my
scheme has 'subprograms' (see below), or you choose a language with a
more chaotic module scheme that can give that flexibility, but requires
more maintenance.
I like structured and modular programming. It is strange to me that
someone would have the tools for that, and then choose not to use
them. But some people seem to prefer piling everything together in one
name space.
This point was about individual functions not files. For example this in Python:
from math import sqrt
(which also has the effect of making 'sqrt' part of this module's
namespace).
Indeed - that's why I say it /sounds/ terrible to me. As I said
above, I prefer a clear and modular organisation. If I am writing
module "foo" and a colleague is writing module "bar" as part of the
same program, the last thing we want is automatic import of each
other's modules, symbols, functions, etc. - especially not jumbled
into the main namespace for the code!
If you don't then you're not really writing the same part of the
program. I mentioned my programs comprise subprograms, which have
stricter import rules and which are non-hierarchical/non-circular.
It sounds like, in my scheme, you'd be working on different subprograms:
each team member has their own chummy collection of modules.
But it's possible you are also working on the same collection. Then you
talk!
C for example has one giant namespace anyway that contains all shared entities. /Then/ you have a problem, especially when people avoid 'static'.
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 02.12.2024 16:24, David Brown wrote:
On 02/12/2024 13:24, Bart wrote:
[...]
That sounds like you have a very poor organisation of your files. [...]
A suspicion that I got as well (also from other posts of him).
I believe he (Bart) claimed that he keeps everything in one directory.
On 02/12/2024 18:23, Janis Papanagnou wrote:
On 02.12.2024 16:24, David Brown wrote:
On 02/12/2024 13:24, Bart wrote:
[...]
That sounds like you have a very poor organisation of your files. [...]
A suspicion that I got as well (also from other posts of him).
Then you'd both be wrong. DB's remark was anyway a misunderstanding.
My projects have all relevant modules, usually measured in dozens, in
the same folder.
On 02/12/2024 20:12, Bart wrote:
On 02/12/2024 18:23, Janis Papanagnou wrote:
On 02.12.2024 16:24, David Brown wrote:
On 02/12/2024 13:24, Bart wrote:
[...]
That sounds like you have a very poor organisation of your files. [...]
A suspicion that I got as well (also from other posts of him).
Then you'd both be wrong. DB's remark was anyway a misunderstanding.
My projects have all relevant modules, usually measured in dozens, in
the same folder.
Oh, dozens of modules! I never realised your programs were that big!
In my current project, there are 155 C files, 44 C++ files, 272 header
files, and 5 linker files over 71 directories. Most of the C files and
a substantial proportion of the headers are libraries and SDKs for the
device in use, most of the C++ files were written by me, but some were written by four other developers at the same time.
When you work with more serious projects, you need a better organisation
than you do for little one-man hobby programs.
On 02/12/2024 19:27, Bart wrote:
On 02/12/2024 15:24, David Brown wrote:
On 02/12/2024 13:24, Bart wrote:
I would consider any kind of automatic import system to be a bad idea.
It's not automatic; you have to list the modules. If you don't want
data shared, then don't export it!
As I understood it, or perhaps failed to understand it (after all, it's
not as if there are published specifications, references,
tutorials or
examples for your language)
you would list your modules for a program
in one place, and then other modules in the program automatically import
them all.
from math import sqrt
(which also has the effect of making 'sqrt' part of this module's
namespace).
This is sometimes convenient for functions that you are going to use a
lot. At other times, you use "import math" and then refer to "math.sqrt".
Perhaps you simply don't know how teams might work.
On 02/12/2024 21:06, David Brown wrote:
Perhaps you simply don't know how teams might work.
Perhaps not. So? Not everyone works in a team.
If I write this
int *A, B[10], C(int);
My compiler tells me that:
A is a local variable with type 'ref i32' (expressed in other syntax)
B is a local variable with type '[10]i32'
C is a function with return type of 'i32', taking one unnamed
parameter of type 'i32'.
(Interestingly, it places C into module scope, so the same declaration can also create names in different scopes!)
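For readers following along, the same C declaration written one
declarator per line - which many style guides prefer precisely because
each type is then obvious at a glance:

int *A;        /* pointer to int */
int  B[10];    /* array of 10 int */
int  C(int);   /* function taking int, returning int */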
Bart <bc@freeuk.com> writes:
...
If I write this
int *A, B[10], C(int);
My compiler tells me that:
A is a local variable with type 'ref i32' (expressed in other syntax)
B is a local variable with type '[10]i32'
C is a function with return type of 'i32', taking one unnamed
parameter of type 'i32'.
(Interestingly, it places C into module scope, so the same declaration can also create names in different scopes!)
A small correction: that declaration gives all three names the same
scope[1].
You are confusing scope with linkage.
On 02/12/2024 21:16, David Brown wrote:
On 02/12/2024 20:12, Bart wrote:
On 02/12/2024 18:23, Janis Papanagnou wrote:
On 02.12.2024 16:24, David Brown wrote:
On 02/12/2024 13:24, Bart wrote:
[...]
That sounds like you have a very poor organisation of your files.
[...]
A suspicion that I got as well (also from other posts of him).
Then you'd both be wrong. DB's remark was anyway a misunderstanding.
My projects have all relevant modules, usually measured in dozens, in
the same folder.
Oh, dozens of modules! I never realised your programs were that big!
In my current project, there are 155 C files, 44 C++ files, 272 header
files, and 5 linker files over 71 directories. Most of the C files
and a substantial proportion of the headers are libraries and SDKs for
the device in use, most of the C++ files were written by me, but some
were written by four other developers at the same time.
When you work with more serious projects, you need a better
organisation than you do for little one-man hobby programs.
So, how would you have organised the 16-module example I posted
elsewhere? (Not a C project, these are 16 source files, so no headers etc.)
Because two posters here have suggested my organisation is poor, but
without knowing how big, small, or complex my projects are.
BTW your project doesn't sound that big, especially if you have a
penchant for having a larger number of smaller files.
A line-count would give a better idea.
The largest C project I attempted to build (with my compiler) was the
Seed7 language a few years ago. I think there were 130-odd .c files,
probably a comparable number of header files.
It took my compiler 1-2 seconds to build from scratch, producing I think
a 1.6MB executable. (That project is notable for coming with nearly 20 different makefiles, tuned for different compilers and platforms.)
The current version includes 176 .c files and 148 .h files (not all used
in any one configuration), which are all kept in one ./src folder. About 190Kloc in all.
I guess that's poorly organised too? It sounds like everybody else's
projects are!
scott@slp53.sl.home (Scott Lurndal) writes:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 01.12.2024 17:42, Bart wrote:
On 01/12/2024 15:08, Janis Papanagnou wrote:
On 01.12.2024 12:52, Bart wrote:
makes typing easier because it is case-insensitive,
I don't think that case-insensitivity is a Good Thing. (I also don't
think it's a Bad Thing.)
I think it's a _real bad thing_ in almost every context related
to programming.
In my view case-insensitive matching/lookup is clearly worse than case-sensitive matching. There may be some contexts where a
case-insensitive rule is tolerable or even preferable, but offhand
I'm not thinking of one. Of course sometimes I do want matching
to allow either case, for which 'grep -i' or some other common
tool solves the problem; the key is that it's my choice, not
a fixed choice imposed by a procrustean software system.
But I want my software maintainable and readable. So my experience
is that I want some lexical "accentuation"; common answers to that
are for identifiers (for example) Camel-Case (that I used in C++),
underscores (that I use in Unix shell, Awk, etc.), or spaces (like
in Algol 68, but which is practically irrelevant for me).
CamelCase reduces typing speed and adds little benefit when compared
with the alternatives (rational abbreviations, or even underscores).
My complaint about CamelCase (or camelCase, which I put in the same
category) is that my eyes have to work quite a bit harder compared
to text using underscores between words. Reading either form of
camelCase is slower, and also requires more mental effort, relative
to using underscores. Exception: CamelCase for a short noun phrase
(up to perhaps three or four words) seems to work well for type
names, probably because I can recognize the phrase as a whole
without needing (most of the time) to look at the individual words.
That property does not hold for names of variables or functions.
For the most part I don't use abbreviations in the usual sense of
the word, although I do sometimes use short non-words in a small
local context (here "short" means usually one or two letters, and
never more than four or five).
On 03/12/2024 11:15, Ben Bacarisse wrote:
Bart <bc@freeuk.com> writes:
...
If I write this
int *A, B[10], C(int);
My compiler tells me that:
A is a local variable with type 'ref i32' (expressed in other syntax)
B is a local variable with type '[10]i32'
C is a function with return type of 'i32', taking one unnamed
parameter of type 'i32'.
(Interestingly, it places C into module scope, so the same
declaration can
also create names in different scopes!)
A small correction: that declaration gives all three names the same
scope[1].
This is what I observed my compiler doing, because it displays the
symbol table. It puts C into module-scope, so I can access it also from another function without another declaration (so non-conforming, but I'm
long past caring).
I can't see gcc's symbol table, but I can't access C from another
function without it having its own declaration, or there being a
module-scope one.
With gcc, such a declaration inside a function suffices to be able to
access a function 'C' defined later in the file.
You are confusing scope with linkage.
It's possible. So a function declaration inside a function gives the
name external linkage (?).
Which in this context means the function will
be outside this one, but elsewhere in the module, rather than being
imported from another module.
If I say I find these quirks of C confusing, people will have another go
at me. So let's say it makes perfect sense for 'extern' to mean two
different things!
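A minimal sketch of the scope/linkage distinction being discussed (file
and function names are made up): the block-scope declaration of C has
external linkage, but its scope ends with the enclosing function, so
another function needs its own declaration (or a file-scope one) before
it can call C.

/* demo.c - hypothetical example */
int f(void)
{
    int C(int);        /* block scope, external linkage */
    return C(1);       /* fine: the declaration is in scope here */
}

int g(void)
{
    /* return C(2); */ /* error without a visible declaration of C */
    int C(int);
    return C(2);
}

int C(int x)           /* definition later in the same file */
{
    return x + 1;
}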
On 03/12/2024 02:23, Tim Rentsch wrote:
scott@slp53.sl.home (Scott Lurndal) writes:
For the most part I don't use abbreviations in the usual sense of
the word, although I do sometimes use short non-words in a small
local context (here "short" means usually one or two letters, and
never more than four or five).
A general guideline followed by most people is to have the length of identifiers (or their semantic content) increase with larger scope of
the identifier. "i" is fine as a counter of a small loop, but you would
not want to use it for a file-scope static.
Which abbreviations are appropriate is often context-dependent. As long
as the context is clear, they can be very helpful - in a genetics
program, you would definitely want to use "DNA_string" in preference to "deoxyribonucleic_acid_string" as an identifier!
But I dislike it when people use things like "indx" for "index" or "cnt"
for "count".
A general guideline followed by most people is to have the length of identifiers (or their semantic content) increase with larger scope of
the identifier. "i" is fine as a counter of a small loop, but you would
not want to use it for a file-scope static.
[...]
But I dislike it when people use things like "indx" for "index" or "cnt"
for "count".
On 03.12.2024 02:23, Tim Rentsch wrote:
In my view case-insensitive matching/lookup is clearly worse than
case-sensitive matching. There may be some contexts where a
case-insensitive rule is tolerable or even preferable, but offhand
I'm not thinking of one. Of course sometimes I do want matching
to allow either case, for which 'grep -i' or some other common
tool solves the problem; the key is that it's my choice, not
a fixed choice imposed by a procrustean software system.
These days, where case-sensitive data is normal, the often seen case-insensitive default for searching is a pain, IMO.
On 02/12/2024 22:53, Bart wrote:
So, how would you have organised the 16-module example I posted
elsewhere? (Not a C project, these are 16 source files, so no headers
etc.)
Because two posters here have suggested my organisation is poor, but
without knowing how big, small, or complex my projects are.
No one (as far as I have noticed) has said that your organisation /is/
poor - they have said it /sounds/ poor from the way you describe it. The difference is very significant.
For file organisation, I'd likely have all the modules in one directory unless there is a particular reason to split them up. I would not have
any non-project files in that directory.
But the questions raised about your organisation was not a matter of
where you store your files, or how they are divided in directories. It
is about how you organise the code and split functionality between files
(or directories, for bigger projects).
What you have described is modules that have far too much in one file, modules with little or no structure as to where things are in the file,
little to no structure or control over which modules use facilities from which other modules, and completely random inter-module dependencies
which can happily be circular.
These opinions are formed from how you describe your code and your
language.
BTW your project doesn't sound that big, especially if you have a
penchant for having a larger number of smaller files.
A line-count would give a better idea.
For the whole project (relevant to things like the build and the organisation):

-----------------------------------------------------------------------
Language          files      blank    comment       code
-----------------------------------------------------------------------
C                   155      13849      34547      80139
C/C++ Header        284      13719      62329      61056
C++                  43       2447       1230      13009
-----------------------------------------------------------------------
SUM:                482      30015      98106     154204
-----------------------------------------------------------------------

For the project-specific code, rather than libraries, SDK, etc.:

-----------------------------------------------------------------------
Language          files      blank    comment       code
-----------------------------------------------------------------------
C++                  39       2217       1020      11820
C/C++ Header         44       1078        798       2696
C                     3        259        237       1152
-----------------------------------------------------------------------
SUM:                 86       3554       2055      15668
-----------------------------------------------------------------------
On 03/12/2024 13:34, Bart wrote:
[...]
Of course lots of C programmers have never read the standards. But they generally know that they don't know these details, and don't try to
pretend that the language details that they don't know are confusing.
They simply get on with the job of writing clear C code as best they can.
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
These days, where case-sensitive data is normal, the often seen
case-insensitive default for searching is a pain, IMO.
In which utility is the default case insensitivity?
vim
and grep, for example, require options to provide case insensitive
searches. (grep -i, vim :set ic, locate -i, et alia).
I suppose windows might do something so useless.
On 01/12/2024 17:57, Michael S wrote:
On Sun, 1 Dec 2024 15:34:04 +0100
David Brown <david.brown@hesbynett.no> wrote:
I can see some advantages in a language being happy with any order of
function definition, without requiring forward declarations to use a
function before it is defined. But C is not like that, and I cannot
honestly say it bothers me one way or the other. And apparently, it
does not particularly bother many people - there is, I think, no
serious impediment or backwards compatibility issue that would
prevent C being changed in this way. Yet no one has felt the need
for it - at least not strongly enough to fight for it going in the
standard or being a common compiler extension.
I think, arguing in favor of such change would be easier on top of
the changes made in C23.
Before C23 there were, as you put it "no serious impediment or
backwards compatibility issue". After C23 we could more categorical
claim that there are no new issues.
Does that mean there was something that you think was allowed in C
before C23, but not after C23, that would potentially be a problem here?
What, specifically, are you thinking of?
On 03/12/2024 14:34, David Brown wrote:
On 02/12/2024 22:53, Bart wrote:
So, how would you have organised the 16-module example I posted
elsewhere? (Not a C project, these are 16 source files, so no headers
etc.)
Because two posters here have suggested my organisation is poor, but
without knowing how big, small, or complex my projects are.
No one (as far as I have noticed) has said that your organisation
/is/ poor - they have said it /sounds/ poor from the way you describe
it. The difference is very significant.
For file organisation, I'd likely have all the modules in one
directory unless there is a particular reason to split them up. I
would not have any non-project files in that directory.
But the questions raised about your organisation was not a matter of
where you store your files, or how they are divided in directories.
It is about how you organise the code and split functionality between
files (or directories, for bigger projects).
What you have described is modules that have far too much in one file,
modules with little or no structure as to where things are in the file,
Because it doesn't really matter.
(My language allows out-of-order
everything. So all module-scope variables could go at the end of the
file, or file-scope variables at the end of the function! However I
don't do that.)
But I really don't want to care about whether function F precedes G in
the file, or follows it. Any more than I would care whether a file "F"
is stored before or after file "G" in a directory! The ordering could be sorted in different ways for display; perhaps an editor could do the
same. (I guess your IDE does that.)
little to no structure or control over which modules use facilities
from which other modules, and completely random inter-module
dependencies which can happily be circular.
They can be circular when it makes sense. They are after all part of the
same program!
In C, if you have 100 modules, but modules 23 and 87 need to share some variable or function, it can be visible to the other 98 too, or can
clash with the same name that 17 and 26 want to share. Or with a name
that module 72 forgot to make static.
Or module 49 exports variable 'abc' as int, but 53 imports it as
'char*', then fun and games follow. C has a lot worse problems!
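The usual C mitigation for the clash part of that is internal linkage; a
tiny sketch with made-up file and function names:

/* module23.c */
static int helper(int x) { return x * 2; }      /* internal linkage: not visible
                                                   to the other 99 modules */
int module23_work(int x) { return helper(x); }

/* module72.c - a different, unrelated "helper" causes no clash,
   as long as nobody forgets the "static" */
static double helper(double x) { return x + 1.0; }
double module72_work(double x) { return helper(x); }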
On 03/12/2024 16:47, Bart wrote:
On 03/12/2024 14:34, David Brown wrote:
On 02/12/2024 22:53, Bart wrote:
So, how would you have organised the 16-module example I posted
elsewhere? (Not a C project, these are 16 source files, so no
headers etc.)
Because two posters here have suggested my organisation is poor, but
without knowing how big, small, or complex my projects are.
No one (as far as I have noticed) has said that your organisation
/is/ poor - they have said it /sounds/ poor from the way you describe
it. The difference is very significant.
For file organisation, I'd likely have all the modules in one
directory unless there is a particular reason to split them up. I
would not have any non-project files in that directory.
But the questions raised about your organisation was not a matter of
where you store your files, or how they are divided in directories.
It is about how you organise the code and split functionality between
files (or directories, for bigger projects).
What you have described is modules that have far too much in one
file, modules with little or no structure as to where things are in
the file,
Because it doesn't really matter.
It really /does/ matter - regardless of what the language allows or does
not allow.
If your language does not enforce ordering rules,
that gives
you more flexibility, but it does not relieve you of your responsibility
as a programmer of writing code in a logical and structured manner.
In C, if you have 100 modules, but modules 23 and 87 need to share
some variable or function, it can be visible to the other 98 too, or
can clash with the same name that 17 and 26 want to share. Or with a
name that module 72 forgot to make static.
C has a risk of name clashes - that's why I am a fan of namespaces
(proper ones, not your weird half-arsed solution).
Or module 49 exports variable 'abc' as int, but 53 imports it as
'char*', then fun and games follow. C has a lot worse problems!
That will be caught at link time, if not before
On 12/2/2024 12:13 PM, Scott Lurndal wrote:
Indeed. One wonders at Bart's familiarity with formal grammars.
In my case, personally I haven't encountered much that doesn't work well enough with a recursive-descent parser.
On Fri, 29 Nov 2024 20:38:51 -0500
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
On 11/29/24 19:55, Waldek Hebisch wrote:
...
Hmm, in well-written code static functions are likely to be a
majority. Some people prefer to declare all functions and
put declarations of static functions in the same file as the
functions themselves. Consequently, function declarations are not
rare in such code. Do you consider it well-written?
I wouldn't go so far as to say that it's poorly written, but I don't
like the unnecessary redundancy of that approach. Whenever possible, I
prefer to let each static function's definition serve as its only
declaration. This isn't possible, for instance, if you have a pair of
mutually recursive functions.
The redundancy between a header file's function declaration and the
corresponding function definition is necessary, given the way that C
works. Avoiding that is one of the reasons I like declaring static
functions, where appropriate.
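A minimal example of the mutual-recursion case, where at least one
separate declaration is unavoidable (names invented for illustration):

#include <stdbool.h>

static bool is_odd(unsigned n);                /* forward declaration needed */

static bool is_even(unsigned n)
{
    return n == 0 ? true : is_odd(n - 1);
}

static bool is_odd(unsigned n)
{
    return n == 0 ? false : is_even(n - 1);
}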
Top-down-minded people don't like details textually preceding "big
picture".
[O.T.]
A better solution would be if a static function definition anywhere in the
file served as a declaration (prototype) for the whole file, including the
preceding part. We are long past the time when a single-pass compiler
was a legitimate argument against such an arrangement. Nowadays the only
possible counter-argument would be breaking existing code. But I don't
see how such a change breaks anything.
On 02/12/2024 22:53, Bart wrote:
For the project-specific code, rather than libraries, SDK, etc.:
-----------------------------------------------------------------------
Language          files      blank    comment       code
-----------------------------------------------------------------------
C++                  39       2217       1020      11820
C/C++ Header         44       1078        798       2696
C                     3        259        237       1152
-----------------------------------------------------------------------
SUM:                 86       3554       2055      15668
------------------------------------------------------------------------
Not that it really matters, but a typical build for my project takes
about 1 to 3 seconds.
On 03/12/2024 14:34, David Brown wrote:
On 02/12/2024 22:53, Bart wrote:
For the project-specific code, rather than libraries, SDK, etc.:
-----------------------------------------------------------------------
Language files blank comment code
-----------------------------------------------------------------------
C++ 39 2217 1020 11820
C/C++ Header 44 1078 798 2696
C 3 259 237 1152
-----------------------------------------------------------------------
SUM: 86 3554 2055 15668
------------------------------------------------------------------------
Not that it really matters, but a typical build for my project takes
about 1 to 3 seconds.
I don't really know what that means, because I don't know what is involved,
and what is being missed out.
So, to get an idea of how long gcc really takes (or would take on my machine), I set up a comparable test. I took an 800-line C program, and duplicated it in 200 separate files f1.c to f200.c, for a total line
count of 167Kloc.
gcc took 30 to 90 seconds to create an EXE using -O0 and -O2.
Under WSL, it took 18 to 54 seconds 'real time' (to .o files; it can't
link due to Win32 imports).
Tiny C took 0.64 seconds.
My two C compilers were rubbish so I won't report those timings (the
newer one only works a file at a time anyway so needs invoking 200 times).
If someone wanted to ensure that all static functions defined in a translation unit are declared near the top, there could be a separate
tool to generate, or at least check, the declarations. I'm not aware of
any such tool, which suggests there probably isn't much demand for it.
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 01.12.2024 17:42, Bart wrote:
On 01/12/2024 15:08, Janis Papanagnou wrote:
On 01.12.2024 12:52, Bart wrote:
makes typing easier because it is case-insensitive,
I don't think that case-insensitivity is a Good Thing. (I also don't
think it's a Bad Thing.)
I think it's a _real bad thing_ in almost every context related
to programming.
it's not fussy about semicolons,
Of the languages I know in detail and am experienced in, none
is "fussy" about semicolons. Rather, it's a simple and well-designed
syntactical token, whether used as separator or terminator. You
just have to put it where it's defined.
Indeed. One wonders at Bart's familiarity with formal grammars.
David Brown <david.brown@hesbynett.no> writes:
On 03/12/2024 02:23, Tim Rentsch wrote:
scott@slp53.sl.home (Scott Lurndal) writes:
For the most part I don't use abbreviations in the usual sense of
the word, although I do sometimes use short non-words in a small
local context (here "short" means usually one or two letters, and
never more than four or five).
A general guideline followed by most people is to have the length of
identifiers (or their semantic content) increase with larger scope of
the identifier. "i" is fine as a counter of a small loop, but you would
not want to use it for a file-scope static.
Which abbreviations are appropriate is often context-dependent. As long
as the context is clear, they can be very helpful - in a genetics
program, you would definitely want to use "DNA_string" in preference to
"deoxyribonucleic_acid_string" as an identifier!
I agree with both of these. In addition, when processing
character strings, I'll often use 'cp' as a character pointer.
On 02/12/2024 21:06, David Brown wrote:
On 02/12/2024 19:27, Bart wrote:
On 02/12/2024 15:24, David Brown wrote:
On 02/12/2024 13:24, Bart wrote:
I would consider any kind of automatic import system to be a bad idea.
It's not automatic; you have to list the modules. If you don't want
data shared, then don't export it!
As I understood it, or perhaps failed to understand it (after all, it's
not as if there are published specifications, references,
https://github.com/sal55/langs/blob/master/Modules24.md
(A summary, mainly for my benefit, about a year old.)
tutorials or
examples for your language)
I gave an example in my post, of the lead module for my C compiler.
Older examples are in that link.
In all (counting the 'pcl' backend library) it comprises 36 source files
in two folders; plus about 40 C headers in two folders, which are
embedded into the executable.
This information is enough for the compiler, and nothing else, to locate
all the necessary files and to build an EXE file.
you would list your modules for a program
in one place, and then other modules in the program automatically import
them all.
The logic used in the name resolution pass of the compiler allows
visibility without needing full name-qualification.
I can rename any module, and I only need to change the one file. I can
move an exported function from one module to another, without making any other changes (e.g. changing all xxx.F() to yyy.F()).
On 12/3/2024 1:27 PM, Janis Papanagnou wrote:
On 03.12.2024 19:57, BGB wrote:
On 12/2/2024 12:13 PM, Scott Lurndal wrote:
Indeed. One wonders at Bart's familiarity with formal grammars.
In my case, personally I haven't encountered much that doesn't work well enough with a recursive-descent parser.
Is that meant as a contradiction? - If so, how?
Formal grammars and parser generators aren't usually necessary IME,
since recursive descent can deal with most everything (and is generally
more flexible than what one can deal with in a formal grammar).
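For readers unfamiliar with the technique, a deliberately tiny
recursive-descent sketch in C (my own illustration, not anyone's actual
compiler) for the grammar  expr := term { ('+'|'-') term },
term := digits:

#include <ctype.h>
#include <stdio.h>

static const char *p;                  /* current position in the input */

static int term(void)                  /* term := one or more digits */
{
    int v = 0;
    while (isdigit((unsigned char)*p))
        v = v * 10 + (*p++ - '0');
    return v;
}

static int expr(void)                  /* expr := term { ('+'|'-') term } */
{
    int v = term();
    while (*p == '+' || *p == '-') {
        char op = *p++;
        int rhs = term();
        v = (op == '+') ? v + rhs : v - rhs;
    }
    return v;
}

int main(void)
{
    p = "12+30-4";
    printf("%d\n", expr());            /* prints 38 */
    return 0;
}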
On 03/12/2024 18:02, David Brown wrote:
On 03/12/2024 16:47, Bart wrote:
On 03/12/2024 14:34, David Brown wrote:
On 02/12/2024 22:53, Bart wrote:
So, how would you have organised the 16-module example I posted
elsewhere? (Not a C project, these are 16 source files, so no
headers etc.)
Because two posters here have suggested my organisation is poor,
but without knowing how big, small, or complex my projects are.
No one (as far as I have noticed) has said that your organisation
/is/ poor - they have said it /sounds/ poor from the way you
describe it. The difference is very significant.
For file organisation, I'd likely have all the modules in one
directory unless there is a particular reason to split them up. I
would not have any non-project files in that directory.
But the questions raised about your organisation was not a matter of
where you store your files, or how they are divided in directories.
It is about how you organise the code and split functionality
between files (or directories, for bigger projects).
What you have described is modules that have far too much in one
file, modules with little or no structure as to where things are in
the file,
Because it doesn't really matter.
It really /does/ matter - regardless of what the language allows or
does not allow.
Why?
In C, if you have 100 modules, but modules 23 and 87 need to share
some variable or function, it can be visible to the other 98 too, or
can clash with the same name that 17 and 26 want to share. Or with a
name that module 72 forgot to make static.
C has a risk of name clashes - that's why I am a fan of namespaces
(proper ones, not your weird half-arsed solution).
What's wrong with my solution? You seem to be making assumptions about it.
All it does is allow you to write F() instead of A.F(). You can do the
same thing in C++ (there it saves you writing A::), by doing this (AIUI):
using A;
I could spend 30 minutes in providing an option so that it needs to be explicit like this, but I don't have a pressing need to do so.
BTW what happens in C++ when you do this:
using A;
using B;
F();
and both A and B export (or make public) F? What happens if there is
also a locally defined F?
Or module 49 exports variable 'abc' as int, but 53 imports it as
'char*', then fun and games follow. C has a lot worse problems!
That will be caught at link time, if not before
Is it?
c:\cx>type a.c
extern void F(void);
int abc;
int main(void) {
    abc = 12345;
    F();
}

c:\cx>type b.c
#include <stdio.h>
extern char* abc;
void F() {
    puts(abc);
}

c:\cx>gcc a.c b.c
c:\cx>a
....
This crashes. This program is impossible to write in my language when
both modules are part of the program.
It is only possible when the two modules are in different binaries, so that one
program needs to work with a potentially incorrect declaration. Even
then, generating a DLL can also export an interface file with the
correct declarations.
Then it can only go wrong if one binary is updated and recompiled, but
not the other. But this applies to any language.
So it's a lot more fool-proof.
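For what it's worth, the usual C discipline that turns the mismatch above
into a compile-time error is a shared header that both files include; a
sketch (header name invented):

/* shared.h - hypothetical common header */
#ifndef SHARED_H
#define SHARED_H
extern int abc;
void F(void);
#endif

/* If b.c does #include "shared.h" and then declares
       extern char *abc;
   the compiler rejects it ("conflicting types for 'abc'") instead of
   the mismatch surviving to run time as in the transcript above. */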
On 03/12/2024 14:34, David Brown wrote:
On 02/12/2024 22:53, Bart wrote:
For the project-specific code, rather than libraries, SDK, etc.:
-----------------------------------------------------------------------
Language files blank comment code
-----------------------------------------------------------------------
C++ 39 2217 1020 11820
C/C++ Header 44 1078 798 2696
C 3 259 237 1152
-----------------------------------------------------------------------
SUM: 86 3554 2055 15668
------------------------------------------------------------------------
Not that it really matters, but a typical build for my project takes
about 1 to 3 seconds.
I don't really know what that means, because I don't know what is involved,
and what is being missed out.
So, to get an idea of how long gcc really takes (or would take on my machine), I set up a comparable test. I took an 800-line C program, and duplicated it in 200 separate files f1.c to f200.c, for a total line
count of 167Kloc.
gcc took 30 to 90 seconds to create an EXE using -O0 and -O2.
Under WSL, it took 18 to 54 seconds 'real time' (to .o files; it can't
link due to Win32 imports).
Tiny C took 0.64 seconds.
My two C compilers were rubbish so I won't report those timings (the
newer one only works a file at a time anyway so needs invoking 200 times).
However the equivalent test in my language, of 160 versions of a
somewhat larger module, took 0.57 seconds (unoptimised compiler).
Anyway, the gcc timings give a different picture from your 1-3 seconds.
On 03/12/2024 21:51, Bart wrote:
On 03/12/2024 14:34, David Brown wrote:
On 02/12/2024 22:53, Bart wrote:
For the project-specific code, rather than libraries, SDK, etc.:
-----------------------------------------------------------------------
Language files blank comment code
-----------------------------------------------------------------------
C++ 39 2217 1020 11820
C/C++ Header 44 1078 798 2696
C 3 259 237 1152
-----------------------------------------------------------------------
SUM: 86 3554 2055 15668
------------------------------------------------------------------------
Not that it really matters, but a typical build for my project takes
about 1 to 3 seconds.
I don't really know what that means, because I don't know what is involved,
and what is being missed out.
So, to get an idea of how long gcc really takes (or would take on my
machine), I set up a comparable test. I took an 800-line C program,
and duplicated it in 200 separate files f1.c to f200.c, for a total
line count of 167Kloc.
gcc took 30 to 90 seconds to create an EXE using -O0 and -O2.
Under WSL, it took 18 to 54 seconds 'real time' (to .o files; it can't
link due to Win32 imports).
Tiny C took 0.64 seconds.
My two C compilers were rubbish so I won't report those timings (the
newer one only works a file at a time anyway so needs invoking 200
times).
The timing from one was about 9 seconds when it was generating 200 .obj files, before I tweaked things to allow an EXE to be created (by using 'static' in each file to avoid clashes).
If I also include my original compiler when I strived to make multiple
files more efficient, then timings on Windows are:
mm     0.53 seconds   (non-C test inputs)
tcc    0.64 seconds
bcc    1 second
mcc    3 seconds
cc     5 seconds      (to produce 200 .obj files)
gcc    32 seconds     (-O0, v.14.1; the WSL timing was for 9.4)
The original 93 seconds for gcc-O2 was for a test using non-static
functions and generating 200 .o files. I reran it for EXE using -O0 but
not with -O2.
If I do so now, gcc-O2 takes 18 seconds, but it produces a too-small EXE file. Presumably it is largely discarding the 199 modules that comprise
only static functions.
So that timing is erroneous.
On 12/3/2024 1:27 PM, Janis Papanagnou wrote:
On 03.12.2024 19:57, BGB wrote:
On 12/2/2024 12:13 PM, Scott Lurndal wrote:
Indeed. One wonders at Bart's familiarity with formal grammars.
In my case, personally I haven't encountered much that doesn't work well enough with a recursive-descent parser.
Is that meant as a contradiction? - If so, how?
Formal grammars and parser generators aren't usually necessary IME,
since recursive descent can deal with most everything (and is generally
more flexible than what one can deal with in a formal grammar).
Though, it seems I didn't read enough, they were debating about
syntactic use of semicolons as a separator rather than parser writing...
[...]
On 02/12/2024 18:13, Scott Lurndal wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 01.12.2024 17:42, Bart wrote:
On 01/12/2024 15:08, Janis Papanagnou wrote:
On 01.12.2024 12:52, Bart wrote:
makes typing easier because it is case-insensitive,
I don't think that case-insensitivity is a Good Thing. (I also don't
think it's a Bad Thing.)
I think it's a _real bad thing_ in almost every context related
to programming.
OK. I think the opposite. So who's right?
On 03/12/2024 18:42, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 01/12/2024 17:57, Michael S wrote:
On Sun, 1 Dec 2024 15:34:04 +0100
David Brown <david.brown@hesbynett.no> wrote:
I can see some advantages in a language being happy with any order of function definition, without requiring forward declarations to use a function before it is defined. But C is not like that, and I cannot honestly say it bothers me one way or the other. And apparently, it does not particularly bother many people - there is, I think, no
serious impediment or backwards compatibility issue that would
prevent C being changed in this way. Yet no one has felt the need
for it - at least not strongly enough to fight for it going in the
standard or being a common compiler extension.
I think, arguing in favor of such change would be easier on top of
the changes made in C23.
Before C23 there were, as you put it "no serious impediment or
backwards compatibility issue". After C23 we could more categorical
claim that there are no new issues.
Does that mean there was something that you think was allowed in C
before C23, but not after C23, that would potentially be a problem here?
What, specifically, are you thinking of?
Michael probably meant 'constexpr'. AFAICS there are no troubles with
automatic reordering of "pure" declarations. But variable declarations
may have initialization and 'constexpr' allows not entirely trivial
expressions for initialization. And C wants to be compatible with
C++, where even now initialization is much more "interesting".
So, reordering variable declarations is problematic due to
initialization and it would be ugly to have a special case for
function declarations.
Prior to C23 you could have non-trivial expressions for initialisations
that depend on the order of the declarations in the code:
enum { a = 1, b = 10, c = 100 };
const int y = (a * b) - (c / 10);
constexpr in C23 gives you a lot of flexibility to do more here, using different types (not just "int"). Prior to C23, people would use
#define'd macros:
#define pi 3.14159265359
const double zeta_2 = (pi * pi) / 6;
constexpr does not actually change the principles here, it just makes
them clearer and neater.
Since macros are not scoped, and can be undefined and redefined, they
will always be an issue for any re-ordering of code. Replacing such "literal" macros with constexpr declarations would make it a lot easier
to support a single wide file-level scope where declaration order does
not matter, rather than making it harder.
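A small sketch of what that looks like in C23 (assuming a C23 compiler;
identifiers reused from the examples above):

constexpr double pi       = 3.14159265359;
constexpr double zeta_2   = (pi * pi) / 6.0;  /* a named constant, usable in
                                                 constant expressions */
constexpr int    buf_size = 100;

void use(void)
{
    char buf[buf_size];                       /* a real array, not a VLA,
                                                 because buf_size is constexpr */
    (void)buf;
}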
David Brown <david.brown@hesbynett.no> wrote:
On 01/12/2024 17:57, Michael S wrote:
On Sun, 1 Dec 2024 15:34:04 +0100
David Brown <david.brown@hesbynett.no> wrote:
I can see some advantages in a language being happy with any order of
function definition, without requiring forward declarations to use a
function before it is defined. But C is not like that, and I cannot
honestly say it bothers me one way or the other. And apparently, it
does not particularly bother many people - there is, I think, no
serious impediment or backwards compatibility issue that would
prevent C being changed in this way. Yet no one has felt the need
for it - at least not strongly enough to fight for it going in the
standard or being a common compiler extension.
I think, arguing in favor of such change would be easier on top of
the changes made in C23.
Before C23 there were, as you put it "no serious impediment or
backwards compatibility issue". After C23 we could more categorical
claim that there are no new issues.
Does that mean there was something that you think was allowed in C
before C23, but not after C23, that would potentially be a problem here?
What, specifically, are you thinking of?
Michael probably meant 'constexpr'. AFAICS there are no troubles with automatic reordering of "pure" declarations. But variable declarations
may have initialization and 'constexpr' allows not entirely trivial expressions for initialization. And C wants to be compatible with
C++, where even now initialization is much more "interesting".
So, reordering variable declarations is problematic due to
initialization and it would be ugly to have a special case for
function declarations.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
If someone wanted to ensure that all static functions defined in a
translation unit are declared near the top, there could be a
separate tool to generate, or at least check, the declarations.
I'm not aware of any such tool, which suggests there probably
isn't much demand for it.
What it suggests to me is that there are tools being used that
you aren't aware of.
What that suggests to me is that you feel the need to snipe without
sharing any actual information.
Of course there are tools that I'm not aware of. I would have
thought that would be too obvious to bother stating.
Are you aware of any such tools? If so, do you have some reason
for concealing information about them? If not, what is the basis
for your assumption that such tools exist?
On 03.12.2024 16:24, David Brown wrote:
On 03/12/2024 13:34, Bart wrote:
[...]
Of course lots of C programmers have never read the standards. But they
generally know that they don't know these details, and don't try to
pretend that the language details that they don't know are confusing.
They simply get on with the job of writing clear C code as best they can.
I feel an urge to point out that the C standard is not necessary to understand concepts that can be read about in textbooks or inferred
(or just tried out, if the available sources are unclear about some
detail).
David Brown <david.brown@hesbynett.no> wrote:
On 01/12/2024 17:57, Michael S wrote:
On Sun, 1 Dec 2024 15:34:04 +0100
David Brown <david.brown@hesbynett.no> wrote:
I can see some advantages in a language being happy with any
order of function definition, without requiring forward
declarations to use a function before it is defined. But C is
not like that, and I cannot honestly say it bothers me one way or
the other. And apparently, it does not particularly bother many
people - there is, I think, no serious impediment or backwards
compatibility issue that would prevent C being changed in this
way. Yet no one has felt the need for it - at least not strongly
enough to fight for it going in the standard or being a common
compiler extension.
I think, arguing in favor of such change would be easier on top of
the changes made in C23.
Before C23 there were, as you put it "no serious impediment or
backwards compatibility issue". After C23 we could more categorical
claim that there are no new issues.
Does that mean there was something that you think was allowed in C
before C23, but not after C23, that would potentially be a problem
here?
What, specifically, are you thinking of?
Michael probably meant 'constexpr'.
You've never used my scheme.
One significant advantage is that because
all modules (and subprogram imports) are listed in the lead module
(usually that's all it contains), it is very easy to build a different configuration using an alternative lead module with a different collection.
On Tue, 3 Dec 2024 17:42:20 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 01/12/2024 17:57, Michael S wrote:
On Sun, 1 Dec 2024 15:34:04 +0100
David Brown <david.brown@hesbynett.no> wrote:
I can see some advantages in a language being happy with any
order of function definition, without requiring forward
declarations to use a function before it is defined. But C is
not like that, and I cannot honestly say it bothers me one way or
the other. And apparently, it does not particularly bother many
people - there is, I think, no serious impediment or backwards
compatibility issue that would prevent C being changed in this
way. Yet no one has felt the need for it - at least not strongly
enough to fight for it going in the standard or being a common
compiler extension.
I think, arguing in favor of such change would be easier on top of
the changes made in C23.
Before C23 there were, as you put it "no serious impediment or
backwards compatibility issue". After C23 we could more categorical
claim that there are no new issues.
Does that mean there was something that you think was allowed in C
before C23, but not after C23, that would potentially be a problem
here?
What, specifically, are you thinking of?
Michael probably meant 'constexpr'.
No, I am afraid of cases where function is used without prototype and
then there is conflicting definition later in the module.
Of course, it's UB, but in practice it could often work fine.
Something like that:
static int bar();

int foo(void)
{
    return bar(42);
}

static int bar(int a, int b)
{
    if (a == 42)
        return -1;
    return a - b;
}
Under C23 rules the code above is illegal, but before C23 it's merely UB.
On Wed, 4 Dec 2024 13:15:53 +0100
David Brown <david.brown@hesbynett.no> wrote:
On 04/12/2024 12:56, Michael S wrote:
On Tue, 3 Dec 2024 17:42:20 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 01/12/2024 17:57, Michael S wrote:
On Sun, 1 Dec 2024 15:34:04 +0100
David Brown <david.brown@hesbynett.no> wrote:
I can see some advantages in a language being happy with any
order of function definition, without requiring forward
declarations to use a function before it is defined. But C is
not like that, and I cannot honestly say it bothers me one way
or the other. And apparently, it does not particularly bother
many people - there is, I think, no serious impediment or
backwards compatibility issue that would prevent C being
changed in this way. Yet no one has felt the need for it - at
least not strongly enough to fight for it going in the standard
or being a common compiler extension.
I think, arguing in favor of such change would be easier on top
of the changes made in C23.
Before C23 there were, as you put it "no serious impediment or
backwards compatibility issue". After C23 we could more
categorical claim that there are no new issues.
Does that mean there was something that you think was allowed in C
before C23, but not after C23, that would potentially be a problem
here?
What, specifically, are you thinking of?
Michael probably meant 'constexpr'.
No, I am afraid of cases where function is used without prototype
and then there is conflicting definition later in the module.
Of course, it's UB, but in practice it could often work fine.
Something like that:
static int bar();

int foo(void)
{
    return bar(42);
}

static int bar(int a, int b)
{
    if (a == 42)
        return -1;
    return a - b;
}
Under C23 rules the code above is illegal, but before C23 it's merely UB.
I think it is always better to have a hard error than to allow UB!
But this is not actually anything to do with ordering of functions or
declarations. You could omit "foo" entirely and the code is still an
error in C23, because "static int bar();" /is/ a prototype
declaration in C23 - it means the same as "static int bar(void);".
In case you lost the context: in the post above I am explaining why I
think that the suggested change - turning any static function
definition into a "retroactive" module-scope prototype declaration -
can potentially break working pre-C23 code. On the other hand, so far
I have not been able to imagine a case in which such a change would
break C23 code. And that is the reason why I think that the change is
easier to introduce on top of C23.
On 03/12/2024 19:42, Bart wrote:
It really /does/ matter - regardless of what the language allows or
does not allow.
Why?
code the reading is important order people to.
Or, if you prefer,
Order is important to people reading the code.
The compiler spends milliseconds reading the code. Programmers spend
hours, days, or more reading the code. It's not particularly important
if the language requires a particular order or not, except in how it
helps or hinders the programmer in their order of the code.
And it is
certainly the case that different programmers will prefer different ways
to order and arrange their code - but that does not stop it being important. When you write code, write it for human readers.
I've already made it clear what I think is wrong about your solution -
the jumbling of namespaces. (And /please/ don't harp on about C's
system again - the fact that C does not have a good way of handling namespaces does not suddenly make /your/ version good.)
All it does is allow you to write F() instead of A.F(). You can do the
same thing in C++ (there it saves you writing A::), by doing this (AIUI):
using A;
(You mean "using namespace A;". It's no problem that you don't know the right syntax for C++, but I'm correcting it in case you want to try
anything on godbolt.org.)
Yes, C++ /allows/ you to do that - if you explicitly choose to do so for
a particular namespace. Thus if you are going to use identifiers from a namespace often, and you are confident it will not lead to conflicts,
then you can do so. C++ "using namespace A;" is commonly used in a few circumstances:
Having every module in a program automatically pull in every exported identifier from every other module in the program is not structured or modular programming - it is anarchy.
You get an error if you try to use "F", as it is ambiguous.
(Defining a
new local function F is also an error even if F is never called.)
This crashes. This program is impossible to write in my language when
both modules are part of the program.
I'm sorry, I thought you meant if a sane C programmer wrote good code
but accidentally had conflicting types. C is not as tolerant of idiots
as some languages.
So it's a lot more fool-proof.
If you are a fool, you should probably avoid programming entirely.
Languages and tools should try to be accident-proof, not fool-proof.
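(An illustration of my own of the kind of accidental cross-module type conflict being discussed above - the file names and the function F are invented. Each file compiles cleanly on its own, but the mismatched declarations make the call undefined behaviour.)

/* one.c */
double F(double x) { return x * 2.0; }

/* two.c */
extern int F(int);          /* conflicts with the definition in one.c */

int use_it(void)
{
    return F(21);           /* UB: F is called through an incompatible type */
}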
A hand written recursive descent parser will have the best tooling.
Bart <bc@freeuk.com> wrote:
You've never used my scheme.
No, I have not used your scheme. But you should understand that when speaking
about module systems C is an outlier, having almost no support
for modules. Some languages, like C++ and Lisp, go half way,
providing namespaces, but the rest is up to the programmer. Others do
more. By now the classic approach is a _logical_ separation into interface
and implementation, which seems to be absent from your system.
On 12/3/2024 1:27 PM, Janis Papanagnou wrote:
On 03.12.2024 19:57, BGB wrote:
On 12/2/2024 12:13 PM, Scott Lurndal wrote:
Indeed. One wonders at Bart's familiarity with formal grammars.
In my case, personally I haven't encountered much that doesn't work well enough with a recursive-descent parser.
Is that meant as a contradiction? - If so, how?
Formal grammars and parser generators aren't usually necessary IME,
since recursive descent can deal with most everything (and is generally
more flexible than what one can deal with in a formal grammar).
On 02/12/2024 18:13, Scott Lurndal wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 01.12.2024 17:42, Bart wrote:
On 01/12/2024 15:08, Janis Papanagnou wrote:
On 01.12.2024 12:52, Bart wrote:
makes typing easier because it is case-insensitive,
I don't think that case-insensitivity is a Good Thing. (I also don't
think it's a Bad Thing.)
I think it's a _real bad thing_ in almost every context related
to programming.
OK. I think the opposite. So who's right?
Of the languages I know in detail and am experienced in, none
is "fussy" about semicolons. Rather it's a simple and well-designed
syntactical token, whether used as separator or terminator. You've
just to put it where it's defined.
Indeed. One wonders at Bart's familiarity with formal grammars.
Why?
Bart <bc@freeuk.com> writes:
On 03/12/2024 11:15, Ben Bacarisse wrote:
Bart <bc@freeuk.com> writes:
...
If I write this
int *A, B[10], C(int);
My compiler tells me that:
A is a local variable [...]
B is a local variable [...]
C is a function with return type of 'i32', taking one
unnamed parameter of type 'i32'.
(Interestingly, it places C into module scope, so the same
declaration can also create names in different scopes!)
A small correction: that declaration gives all three names the
same scope[1].
This is what I observed my compiler doing, because it displays the
symbol table. It puts C into module-scope, so I can access it also
from another function without another declaration (so
non-conforming, but I'm long past caring).
You invented the term "module-scope" and that means you've lost
track of what the scope of a name is in C. [...]
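(A small sketch of my own of the scope point: in standard C a declaration written inside a function - objects and functions alike - has block scope, so it is visible only inside that block. C and g must of course be defined somewhere for the program to link.)

int g(int);                  /* file scope: visible in everything below */

void f(void)
{
    int *A, B[10], C(int);   /* block scope: A, B and C are visible only in f() */
    A = &B[0];
    *A = C(1) + g(2);        /* fine: both C and g are in scope here */
}

/* In another function, a call C(1) would need its own declaration of C;
   the declaration inside f() is not visible there. */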
On 03/12/2024 14:34, David Brown wrote:
On 02/12/2024 22:53, Bart wrote:
For the project-specific code, rather than libraries, SDK, etc.:
-----------------------------------------------------------------------
Language files blank comment code
-----------------------------------------------------------------------
C++ 39 2217 1020 11820
C/C++ Header 44 1078 798 2696
C 3 259 237 1152
-----------------------------------------------------------------------
SUM: 86 3554 2055 15668
------------------------------------------------------------------------
Not that it really matters, but a typical build for my project takes
about 1 to 3 seconds.
I don't really know what that means, because I don't know what is involved,
and what is being missed out.
So, to get an idea of how long gcc really takes (or would take on my machine), I set up a comparable test. I took an 800-line C program, and duplicated it in 200 separate files f1.c to f200.c, for a total line
count of 167Kloc.
gcc took 30 to 90 seconds to create an EXE using -O0 and -O2.
Under WSL, it took 18 to 54 seconds 'real time' (to .o files; it can't
link due to Win32 imports).
Tiny C took 0.64 seconds.
Ben Bacarisse <ben@bsb.me.uk> writes:
Bart <bc@freeuk.com> writes:
On 03/12/2024 11:15, Ben Bacarisse wrote:
Bart <bc@freeuk.com> writes:
...
If I write this
int *A, B[10], C(int);
My compiler tells me that:
A is a local variable [...]
B is a local variable [...]
C is a function with return type of 'i32', taking one
unnamed parameter of type 'i32'.
(Interestingly, it places C into module scope, so the same
declaration can also create names in different scopes!)
That means your compiler is not compiling standard C. In standard C
all entities declared locally have block scope, not file scope.
That doesn't invalidate your reaction to Bart's statement, because
he is being sloppy with language,
Bart <bc@freeuk.com> writes:
OK. I think the opposite. So who's right?
The consensus so far does not favor your viewpoint.
Of the languages I know in detail and am experienced in, none
is "fussy" about semicolons. Rather it's a simple and well-designed
syntactical token, whether used as separator or terminator. You've
just to put it where it's defined.
Indeed. One wonders at Bart's familiarity with formal grammars.
Why?
Because having a formal grammar is the keystone of a good
language design.
On 04/12/2024 16:57, Scott Lurndal wrote:
Because having a formal grammar is the keystone of a good
language design.
Do you use a formal grammar when parsing a CSV file, or something
equally trivial?
[...]
Do you use a formal grammar when parsing a CSV file, or something
equally trivial?
Parsers don't need such grammars. The information needed to construct
one manually can be in the programmer's mind,
it might be an informal
description in English, or it might be defined by the examples of an existing language.
Or the parser was created for a trivial language and has evolved.
A formal or informal grammar might be useful in a language reference -
once the language is stable.
[...]
For the rest of us, that part is the simplest part of a compiler. You
write it, and move on.
On 12/4/2024 10:49 AM, Scott Lurndal wrote:
How can you write a correct recursive descent parser without a
formal grammar (at least on paper) for the language being parsed?
It is usually a thing of pattern matching the next N tokens, based on
the structure of the language.
You can write or reference a syntax in BNF or EBNF or similar, but it is
not necessary, and some languages (like C) may contain things that can't
be fully expressed via a BNF (say, for example, things that depend on
prior typedefs, etc).
[snip rest; tldr]
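(For what it's worth, a rough sketch - the names and assumptions are mine - of how a hand-written recursive-descent C parser typically copes with the typedef problem: the parser records typedef names as it sees them, and the lexer consults that table to decide whether an identifier should be treated as a type name. Real compilers also track scopes; that is omitted here.)

#include <string.h>

#define MAX_TYPEDEFS 1024

static const char *typedef_names[MAX_TYPEDEFS];
static int num_typedefs;

/* Called by the parser when it has just parsed "typedef ... name;". */
static void register_typedef(const char *name)
{
    if (num_typedefs < MAX_TYPEDEFS)
        typedef_names[num_typedefs++] = name;
}

/* Called by the lexer: should this identifier be tokenised as a type name? */
static int is_typedef_name(const char *name)
{
    for (int i = 0; i < num_typedefs; i++)
        if (strcmp(typedef_names[i], name) == 0)
            return 1;
    return 0;
}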
On 03/12/2024 17:12, Janis Papanagnou wrote:
On 03.12.2024 16:24, David Brown wrote:
On 03/12/2024 13:34, Bart wrote:
[...]
Of course lots of C programmers have never read the standards. But they generally know that they don't know these details, and don't try to
pretend that the language details that they don't know are confusing.
They simply get on with the job of writing clear C code as best they
can.
I feel an urge to point out that the C standard is not necessary to
understand concepts that can be read about in textbooks or inferred
(or just tried out, if the available sources are unclear about some
detail).
Sure. And it is certainly /possible/ to know all the small details of C without ever reading the standards. But it's quite unlikely.
[...]
Bart is an expert at thinking up things in C that confuse him.
In the
real world, programmers simply don't do that kind of thing - and the
kind of C programmer who is interested in these details will almost
certainly also be interested in reading the standards.
Most C programmers never look at the standards - textbooks, decent
reference sites (like www.cppreference.com), common sense, and trial and error with quality compilers is sufficient for most programmers.
There's nothing at all wrong with not knowing [...]
The second is trying to pass the buck - blame your
tools, blame other people, blame anyone but yourself. [...]
Bart <bc@freeuk.com> writes:
Tiny C took 0.64 seconds.
Here's a typical build for my project:
$ make -s -j 64
...
COMPILE apc2.cpp
COMPILE bcp2.cpp
BUILD /scratch/lib/libdevice_yyy.so
XXXX_BUILT
BUILDSO /scratch/libxxxx.so.1.0
BUILD TARGET /scratch/bin/xxxx
COMPILE shim.cpp
real 13m48.24s
user 1h2m9.78s
sys 0m56.73s
13 minutes by the wall clock, a bit over an hour
of CPU time.
On 04.12.2024 18:43, Bart wrote:
[...]
Do you use a formal grammar when parsing a CSV file, or something
equally trivial?
Is that the reason why there are so many versions around that are
incompatible? CSV parsing is not "trivial" if you look into the
details; you have to specify these (at first glance not obvious)
details to be sure that your "CSV data" works not only with your "CSV parser" but that there's a common understanding of the CSV
"language". (It's only a few details - delimiters in string values,
escapes, and such - but enough to give rise to incompatible formats.)
Yes, of course; if there had been a formal specification
in the first place we wouldn't have the mess we now actually have.
And if you write your tools only for yourself anyway, and if you
don't intend to exchange data with others, no one cares what you
think the "CSV format" actually is.
But we weren't discussing such comparably simple structures; we
have been discussing programming languages (and their grammars).
And most of us are considering sensible languages, not privately
hacked up toy languages or implementations of personal hobbies.
Human languages (specifically English, with its many irregularities)
are, compared to a formal language, an unreliable base and unsuited
for use as programming languages.
That's nothing more than a hacker's feeble excuse to justify his
ignorance.
Or the parser was created for a trivial language and has evolved.
Like, as I've heard, the Unix shell, with all its irregularities
and quirks? - You really think this is a good paragon?
Even if some hacker defines a language ad hoc - and Intercal comes
to my mind - for a serious programming language you should have
[documented] syntax and semantics, and why the hell would anyone
use English instead of a formal specification for syntax.
A formal or informal grammar might be useful in a language reference -
once the language is stable.
For a hacker like you maybe. Professionals typically specify before implementation.
[...]
For the rest of us, that part is the simplest part of a compiler. You
write it, and move on.
Is that the reason why your languages and compilers are so widely
used? </sarcasm>
On 04.12.2024 13:08, David Brown wrote:
Sure. And it is certainly /possible/ to know all the small details of C
without ever reading the standards. But it's quite unlikely.
The question is (IMO) not so much to know "all" and even "all small"
details. Even in a language like "C" (that I'd consider to be fairly incoherent if compared to other languages' design) you can get all "important" language properties (including details) from textbooks.
In cases where that is different, the standards documents - which
have their very own special way of being written - would be even
less comprehensible than they (inherently) already are. - That said
from a programmer's POV (not from the language implementors').
I look into language standards only if I want to confirm/falsify
an implementation; in this case I'm taking the role of a language
implementor (not a programmer). Personally I do that anyway only
rarely, for specific languages only, and just out of academic
interest.
[...]
Bart is an expert at thinking up things in C that confuse him.
Well, that's a topic of its own. - Where I was really astonished was the
statement of being confused about the braces/semicolons, which is
so fundamental (and primitive) but technically just a detail, so
I'd thought it should be clear.
On 04/12/2024 09:02, David Brown wrote:
On 03/12/2024 19:42, Bart wrote:
It really /does/ matter - regardless of what the language allows or
does not allow.
Why?
code the reading is important order people to.
Or, if you prefer,
Order is important to people reading the code.
The compiler spends milliseconds reading the code. Programmers spend
hours, days, or more reading the code. It's not particularly
important if the language requires a particular order or not, except
in how it helps or hinders the programmer in their order of the code.
You've lost me.
A language source file isn't a story that needs to be consumed
sequentially. It will be a collection of functions that will have some arbitrary call-pattern at runtime, perhaps different each time.
So the static ordering is irrelevant.
I've already made it clear what I think is wrong about your solution -
the jumbling of namespaces. (And /please/ don't harp on about C's
system again - the fact that C does not have a good way of handling
namespaces does not suddenly make /your/ version good.)
All it does is allow you to write F() instead of A.F(). You can do
the same thing in C++ (there it saves you writing A::), by doing this
(AIUI):
using A;
(You mean "using namespace A;". It's no problem that you don't know
the right syntax for C++, but I'm correcting it in case you want to
try anything on godbolt.org.)
Yes, C++ /allows/ you to do that - if you explicitly choose to do so
for a particular namespace. Thus if you are going to use identifiers
from a namespace often, and you are confident it will not lead to
conflicts, then you can do so. C++ "using namespace A;" is commonly
used in a few circumstances:
It's used to avoid cluttering code with ugly xxx:: qualifiers, and save
some typing at the same time. That's pretty much it.
To that end, C++ and my language achieve the same thing.
I just decided
to make 'using namespace xxx' the default, and haven't got around to
making it optional. (In an early version, I did need such a directive.)
(However it most likely differs from C++ in what it calls 'namespaces'.
My remarks have been about the namespace that is created by a module.
I understand that C++ namespaces can be created in other ways, like
classes.
I sort of have that too, but rarely use the feature:
record foo =
proc F = println "FOO/F" end
end
record bar =
proc F = println "BAR/F" end
end
proc F = println "MAIN/F" end
proc main =
foo.F()
bar.F()
F()
end
Here, I don't need to create instances of foo and bar; they serve to encapsulate any kind of named entity.)
This crashes. This program is impossible to write in my language when
both modules are part of the program.
I'm sorry, I thought you meant if a sane C programmer wrote good code
but accidentally had conflicting types. C is not as tolerant of
idiots as some languages.
So it's a lot more fool-proof.
If you are a fool, you should probably avoid programming entirely.
Fuck you.
Obviously you're going to shoot down whatever I say, trash
whatever I have achieved, because .... I've really no idea.
Yesterday you tried to give the misleading impression that compiling a substantial 200Kloc project only took 1-3 seconds with gcc.
I gave some timings that showed gcc-O0 taking 50 times longer than tcc,
and 150 times longer with -O2.
That is the real picture. Maybe your machine is faster than mine, but I
doubt it is 100 times faster. (If you don't like my benchmark, then
provide another in portable C.)
All this just so you can crap all over the benefits of small, faster,
simpler tools.
I'm pretty sure tcc won't compile your projects, because your projects
will be 100% dependent on the special features, extensions, and options
of your preferred tools. So that is hardly surprising.
But that's not a reason to dismiss it for everyone else.
Languages and tools should try to be accident-proof, not fool-proof.
And yet my product IS fool-proof. So again, fuck you.
On 04/12/2024 16:09, Bart wrote:
It's used to avoid cluttering code with ugly xxx:: qualifiers, and
save some typing at the same time. That's pretty much it.
Buy a better keyboard. Saving typing is a pathetic excuse.
That seems to be something akin to C++ classes or structs with static methods. But I'm not sure, and certainly not sure how it is relevant.
Of course tcc won't work - I work with embedded systems, and as I have
said countless times, your little tools don't target the devices I use.
I /need/ good optimisations - it's not an option, it's a requirement to
get fast enough code for the end system to work. So your toys would not
be usable even if they supported the right target.
I use C++ as well as C - your toys don't support that. I use C17 (and will move to C23 when
my tools support it) - tcc has only partial C99 support, and $DEITY only knows what partial C standard your compiler handles.
On 04/12/2024 19:23, Janis Papanagnou wrote:
On 04.12.2024 18:43, Bart wrote:
[...]
Do you use a formal grammar when parsing a CSV file, or something
equally trivial?
But we weren't discussing such comparably simple structures; we
have been discussing programming languages (and their grammars).
And most of us are considering sensible languages, not privately
hacked up toy languages or implementations of personal hobbies.
You really hate toy languages don't you?
The fact is that a compiler is only a bit of software like anything
else. It might take some input, process it, and produce output.
When someone has claimed to write some program, do you always demand
they produce a 'formal grammar' for it?
What is the complexity threshold anyway for something to need such a
formal specification?
The C language is one of the quirkiest ones around, full of
apparently ridiculous things. Why shouldn't you be able to write this,
for example:
{
....
L:
}
This stupid rule means that EVERY label in my generated code needs to
be written as L:; instead of just L:
Please don't say the label is only defined to be a prefix to another statement. I am asking why it was done like that.
On 04/12/2024 17:55, Scott Lurndal wrote:
13 minutes by the wall clock, a bit over an hour
of CPU time.
Is that a "typical" build - or an atypical full clean build?
On 04/12/2024 19:47, Janis Papanagnou wrote:
[...]
Bart is an expert at thinking up things in C that confuse him.
Well, that's an own topic. - Where I was really astonished was the
statement of being confused about the braces/semicolons, which is
so fundamental (and primitive) but technically just a detail that
I'd thought it should be clear
OK, if it's so simple, explain it to me.
Apparently the first line here needs a semicolon after }, the second
doesn't:
int X[1] = {0};
void Y() {}
Similarly here:
if (x) y;
if (x) {}
Why?
"Because that's what the grammar says" isn't a valid answer.
On 04/12/2024 22:31, Scott Lurndal wrote:
/My/ real-world apps started with a general idea and then evolved.
I mean, what would be the formal specification for AutoCAD? (The nearest product to mine at the time.)
scott@slp53.sl.home (Scott Lurndal) writes:
Bart <bc@freeuk.com> writes:[...]
Do you use a formal grammar when parsing a CSV file, or something
equally trivial?
I wouldn't call CSV trivial, and yes, I'd want to have a precise specification of the format in front of me if I wanted to implement
it. RFC 4180 seems to be the closest thing we have to a standard,
but it's not universally followed.
I don't parse csv files, there are dozens of tools already
available to perform that action for me with no need
to reinvent the wheel.
for example, cvstool comes with most linux distributions:
https://github.com/maroofi/csvtool
The csvtool that's available for most Linux distributions isn't
that one. It's another tool of the same name, implemented in OCaml,
with a completely different user interface. I don't know whether
they implement the same CSV specification.
https://github.com/Chris00/ocaml-csv
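(To make the "CSV is not trivial" point concrete, here is a deliberately incomplete sketch of my own of RFC 4180-style field reading: quoted fields may contain commas and newlines, and an embedded quote is doubled. Real inputs add "\r\n" line endings, other delimiters and encoding issues on top of this.)

#include <stdio.h>

/* Reads one field from fp into buf (capacity n > 0); returns the character
   that ended the field: ',', '\n' or EOF. */
static int read_field(FILE *fp, char *buf, size_t n)
{
    size_t len = 0;
    int c = fgetc(fp);

    if (c == '"') {                        /* quoted field */
        for (;;) {
            c = fgetc(fp);
            if (c == EOF)
                break;
            if (c == '"') {
                c = fgetc(fp);
                if (c != '"')              /* closing quote; c is ',', '\n' or EOF */
                    break;
                /* doubled quote: fall through and store a literal '"' */
            }
            if (len + 1 < n)
                buf[len++] = (char)c;
        }
    } else {                               /* unquoted field */
        while (c != EOF && c != ',' && c != '\n') {
            if (len + 1 < n)
                buf[len++] = (char)c;
            c = fgetc(fp);
        }
    }
    buf[len] = '\0';
    return c;
}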
Bart <bc@freeuk.com> writes:
On 04/12/2024 19:23, Janis Papanagnou wrote:
On 04.12.2024 18:43, Bart wrote:
[...]
Do you use a formal grammar when parsing a CSV file, or something
equally trivial?
You really hate toy languages don't you?
Nobody has made that claim other than yourself.
I don't feel that your 'toy' languages are interesting
in the context of comp.lang.c.
The fact is that a compiler is only a bit of software like anything
else. It might take some input, process it, and produce output.
When someone has claimed to write some program, do you always demand
they produce a 'formal grammar' for it?
Now you're being ridiculous.
Intentionally, no doubt, to induce
a response.
What is the complexity threshold anyway for something to need such a
formal specification?
Most real-world working applications start with a formal specification.
As do most other real-world projects whether it is a payroll application
or a massive construction project like Vogtle Unit #3.
snip rant
Bart <bc@freeuk.com> writes:
OK, if it's so simple, explain it to me.
I'll pretend that was a sincere question.
You seem to be under the impression that a closing brace should
either always or never be followed by a semicolon. I don't know
where you got that idea.
Braces ("{", "}") are used in different contexts with different
meanings. They're generally used for some kind of grouping (of
statements, declarations, initializers), but the distinct uses are
distinct, and there's no particular reason for them to follow the
same rules.
Apparently the first line here needs a semicolon after }, the second
doesn't:
int X[1] = {0};
void Y() {}
Yes. The first is a declaration, and a declaration requires a
semicolon. I suppose the language could have a special-case rule
that if a declaration happens to end with a "}", the semicolon is
not required, but that would be silly.
The second is a function definition (and can only appear at file
scope). Function definitions do not require or allow a semicolon
after the closing "}". Why should they?
Similarly here:
if (x) y;
if (x) {}
Why?
"Because that's what the grammar says" isn't a valid answer.
Because that's what the grammar says.
Not all statements require a closing semicolon. In particular,
compound statements do not, likely because the closing
"}" unambiguously marks the end of the statement. Sure, the
language could have been specified to require a semicolon, but why?
(I'll note that languages that use "begin"/"end" rather than "{"/"}"
often require a semicolon after the "end".)
And you can add a semicolon after a compound statement if you like
(it's a null statement),
Of course you know all this.
Please don't say the label is only defined to be a prefix to another
statement. I asking why it was done like that.
The label is only defined to be a prefix to another statement.
It was simple to define it that way, and not particularly
inconvenient to add a semicolon if you happen to want a label at
the end of a block. I'd be surprised if this rule has ever actually
caused you any inconvenience.
But you'll be delighted to know that C23 changed the grammar for a
compound statement, so a label can appear before any of a statement,
a declaration, or the closing "}". So now you have exactly what you
want. (Just kidding; you'll still find a way to be angry about it.)
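(Pulling those points together in one small example of my own: the ';' after a declaration's '}' matters because declarators may follow, a function definition needs none, a ';' after a compound statement is merely a null statement, and before C23 a label had to prefix a statement.)

struct point { int x, y; } origin = { 0, 0 };  /* declaration: the ';' is required,
                                                  because a declarator ('origin')
                                                  can follow the '}'               */

void bump(struct point *p)                     /* function definition: no ';' after '}' */
{
    if (p->x) { p->x++; }                      /* no ';' needed after this '}'; adding
                                                  one is a null statement (and would
                                                  break a following 'else')        */
    if (p->y < 0)
        goto done;
    p->y *= 2;
done: ;                                        /* pre-C23: the label needs a statement,
                                                  hence ":;"; in C23 "done:" alone
                                                  before the closing '}' is allowed */
}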
Bart <bc@freeuk.com> writes:
[...]
My language is intended for whole program compilation. There are no
interfaces between modules (that is, a file listing the exports of a
module, separately from the implementation).
Since you keep telling us about your language,
What is it called? (I'm assuming it has a name.)
Is there an implementation that others can use?
Is there a specification?
Would you consider supporting the creation of a new alt.comp.lang.<name> newsgroup where people who are actually interested in it can discuss it?
On 04/12/2024 16:09, Bart wrote:
On 04/12/2024 09:02, David Brown wrote:
On 03/12/2024 19:42, Bart wrote:
Yesterday you tried to give the misleading impression that compiling a
substantial 200Kloc project only took 1-3 seconds with gcc.
No, I did not. I said my builds of that project typically take 1-3
seconds. I believe I was quite clear on the matter.
If I do a full, clean re-compile of the code, it takes about 12 seconds
or so. But only a fool would do that for their normal builds. Are you
such a fool? I haven't suggested you are - it's up to you to say if
that's how you normally build projects.
If I do a full, clean re-compile /sequentially/, rather than with
parallel jobs, it would be perhaps 160 seconds. But only a fool would
do that.
I gave some timings that showed gcc-O0 taking 50 times longer than tcc,
and 150 times longer with -O2.
That is the real picture. Maybe your machine is faster than mine, but I
doubt it is 100 times faster. (If you don't like my benchmark, then
provide another in portable C.)
All this just so you can crap all over the benefits of small, faster,
simpler tools.
Your small, fast, simple tools are - as I have said countless times -
utterly useless to me. Perhaps you find them useful, but I have never
known any other C programmer who would choose such tools for anything
but very niche use-cases.
The real picture is that real developers can use real tools in ways that
they find convenient. If you can't do that, it's your fault. (I don't
even believe it is true that you can't do it - you actively /choose/ not
to.)
And since compile speed is a non-issue for C compilers under most circumstances, compiler size is /definitely/ a non-issue, and
"simplicity" in this case is just another word for "lacking useful
features", there are no benefits to your tools.
On Fri, 29 Nov 2024 13:33:30 +0000
Bart <bc@freeuk.com> wrote:
[C syntax] allows a list of variable names in the same declaration
to each have their own modifiers, so each can be a totally
different type
Not in every context. It is not allowed in function prototypes.
Even when it is allowed, it's never necessary and is avoided by the
majority of experienced programmers.
I'd guess, TimR will disagree with the last part.
On 04/12/2024 12:29, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
You've never used my scheme.
No, I have not used your scheme. But you should understand that when speaking
about module systems C is an outlier, having almost no support
for modules. Some languages, like C++ and Lisp, go half way,
providing namespaces, but the rest is up to the programmer. Others do
more. By now the classic approach is a _logical_ separation into interface
and implementation, which seems to be absent from your system.
It is there, but at a different boundary.
My language is intended for whole program compilation. There are no interfaces between modules (that is, a file listing the exports of a
module, separately from the implementation).
Because the compiler can see (and will compile) the actual implementation of each module, no separate interface description is needed.
Such interface files exist between programs, which usually means between
a main program and the libraries it uses, which are generally
dynamically loaded.
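(For comparison, a minimal sketch of my own of the conventional way that logical interface/implementation split is done in C - the module and its names are invented.)

/* counter.h - the interface: what other modules are allowed to see */
#ifndef COUNTER_H
#define COUNTER_H

void counter_reset(void);
int  counter_next(void);

#endif

/* counter.c - the implementation: internal state stays static (file scope only) */
#include "counter.h"

static int value;

void counter_reset(void) { value = 0; }
int  counter_next(void)  { return ++value; }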
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 03.12.2024 16:24, David Brown wrote:
Of course lots of C programmers have never read the standards.
But they generally know that they don't know these details, and
don't try to pretend that the language details that they don't
know are confusing. They simply get on with the job of writing
clear C code as best they can.
I feel an urge to point out that the C standard is not necessary
to understand concepts that can be read about in textbooks
That assumes that such textbooks exist, that they can be identified,
located, and obtained without too much difficulty, and don't cost too
much to get.
The C standard is easily located and obtained, and
costs nothing to download (for a draft that is virtually identical
to the actual standard).
or inferred (or just tried out, if the available sources are
unclear about some detail).
Two problems with that suggestion. One, trying to figure out what
the rules are by experimentation is sometimes difficult and
unreliable.
Two, some details, such as what constructs result in
undefined behavior, simply cannot be determined by means of
experimentation.
Also, the idea that one can figure out the rules of the C language
by looking at compiler sources is just laughable.
On 04/12/2024 22:58, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
OK, if it's so simple, explain it to me.
I'll pretend that was a sincere question.
Let's put it this way: if somebody asked me what the rule was, I
wouldn't be able to tell them.
[...]
Consistency? I posted a bit of Algol68 the other day; each function definition needed a semicolon, to separate it from the next.
[...]
I'm aware of how messy and inconsistent it is. I've said that languages
that use ";" as a separator probably fare better, [...]
[...]
My own made a compromise here: use ";" as separator,
but introduce some
rules so that an explicit ";" is rarely needed. I think that wins.
I said that my generated code has to use ":;" after each label; it looks weird.
[...][...]
Of course, if it was just me, then it would be pointless ranting. 'How
hard is to add the semicolon?'
Well, 'How hard is it to delete the semicolon' from my above example?
Bart <bc@freeuk.com> writes:
Let's put it this way: if somebody asked me what the rule was, I
wouldn't be able to tell them.
Probably because there's no single rule. As I wrote, "{" and "}"
are used in different contexts with different meanings. [...]
[...]
Consistency? I posted a bit of Algol68 the other day; each function
definition needed a semicolon, to separate it from the next.
C is not Algol68. In C, the syntax of a function definition is such
that they can be appended without ambiguity. Requiring a semicolon
after each definition would not help anything. [...]
On 05.12.2024 00:57, Bart wrote:
On 04/12/2024 22:58, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
OK, if it's so simple, explain it to me.
I'll pretend that was a sincere question.
Let's put it this way: if somebody asked me what the rule was, I
wouldn't be able to tell them.
If that is true I suggest reading one or two textbooks.
Consistency? I posted a bit of Algol68 the other day; each function
definition needed a semicolon, to separate it from the next.
Why do you explicitly name functions here? - Semicolons are a
general (sequentializing) separator in Algol 68, and not really
different from other programming languages.
[...]
I'm aware of how messy and inconsistent it is. I've said that languages
that use ";" as a separator probably fare better, [...]
(Oh, I thought you said you preferred terminators in languages.)
[...]
My own made a compromise here: use ";" as separator,
(I know that from Awk, where I regularly use that feature.)
but introduce some
rules so that an explicit ";" is rarely needed. I think that wins.
Would that then be (like in Awk) that you assume the line-end
as separator? (For a scripting language I think that's okay.)
I said that my generated code has to use ":;" after each label; it looks
weird.
It's funny that you put such an emphasis on labels. - Jumps to
labels have long been academically deprecated but, more importantly,
in my practical work I never had a need to use them (or even thought
of or considered using them). (I suppose it's how one got socialized
to design and write programs.)
You obviously have problems on a level that other folks don't have.
Janis
On 05.12.2024 02:29, Tim Rentsch wrote:
Also, the idea that one can figure out the rules of the C language
by looking at compiler sources is just laughable.
Sure. (But that wasn't something that I said. By "sources" I did
not mean "source code" but sources of information, books, etc.)
The idea to consider source code quasi as a [syntax-]"defining"
source came from Bart elsethread.
On 04/12/2024 19:23, Janis Papanagnou wrote:
On 04.12.2024 18:43, Bart wrote:
You really hate toy languages don't you?
[...]
When someone has claimed to write some program, do you always demand
they produce a 'formal grammar' for it?
What is the complexity threshold anyway for something to need such a
formal specification?
What input would even be classed as a language: are binary file formats languages? What about the commands of a CLI?
[...]
That's nothing more then a hacker's feeble excuse to justify his
ignorance.
So you're calling me a hacker, and you're calling me ignorant.
Nice.
[...]
My own products were in-house tools to get stuff done. And they worked spectacularly well.
[...]
OK, you're a professional; I'm an amateur (an amateur who managed to
retire at 42). I will give you that. Satisfied?
[...]
For the rest of us, that part is the simplest part of a compiler. You
write it, and move on.
Is that the reason why your languages and compilers are so widely
used? </sarcasm>
By contrast, the languages that you've devised are in common use of course?
David Brown <david.brown@hesbynett.no> writes:
On 04/12/2024 17:55, Scott Lurndal wrote:
13 minutes by the wall clock, a bit over an hour
of CPU time.
Is that a "typical" build - or an atypical full clean build?
That was a build that touched a key header file, so maybe
85% full. A full build adds a minute or two wall time.
A single source file build (for most source files) takes
about 28 seconds real, a large portion of that in make as it
figures out what to rebuild in a project with thousands
of source files and associated dependencies.
On 04/12/2024 20:55, David Brown wrote:
On 04/12/2024 16:09, Bart wrote:
It's used to avoid cluttering code with ugly xxx:: qualifiers, and
save some typing at the same time. That's pretty much it.
Buy a better keyboard. Saving typing is a pathetic excuse.
How does a better keyboard make this less cluttery:
std::cout << xxx::X << yyy::Y << std::endl
Compare with:
cout << X << Y << endl
That seems to be something akin to C++ classes or structs with static
methods. But I'm not sure, and certainly not sure how it is relevant.
You said my namespaces were jumbled. They generally work as people
expect, other than some different rules on name resolution.
Compare, for example, how filenames are resolved in a file system when
no absolute or relative path is given. Sometimes it will look in
a list of special places (e.g. using Windows' PATH variable) if not found locally.
You don't agree with my choices, OK.
Of course tcc won't work - I work with embedded systems, and as I have
said countless times, your little tools don't target the devices I use.
In my hardware days I built boards with embedded processors too.
I programmed them with my 'toy' language. Sometimes I didn't have a compiler, so a couple of times I wrote an assembler for the device. The assembler (obviously, you will consider that a toy) was written in my
toy language. (One was for 80188, the other was 8035 or 8051.)
My point is that 'toy' implementations /could/ work just as well, and
here I was able to get working prototypes. (At least, until my boss had
some new ideas and I moved on to something else.)
I /need/ good optimisations - it's not an option, it's a requirement
to get fast enough code for the end system to work. So your toys
would not be usable even if they supported the right target.
And I did without an optimising compiler for 20 years. Not that C
compilers were that much better for a lot of that period, of 82-02.
I use C++ as well as
C - your toys don't support that. I use C17 (and will move to C23
when my tools support it) - tcc supports only up to partial C99
support, and $DEITY only knows what partial C standard your compiler
handles.
My C compiler is for my personal use. It will build only a tiny fraction
of existing code, but will compile 100% of the code I write, or generate.
So I'm not asking people to use it. But it shows what is possible.
Bart <bc@freeuk.com> writes:
On 04/12/2024 22:58, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
OK, if it's so simple, explain it to me.
I'll pretend that was a sincere question.
Let's put it this way: if somebody asked me what the rule was, I
wouldn't be able to tell them.
Probably because there's no single rule. As I wrote, "{" and "}"
are used in different contexts with different meanings. (So are
"(", ")", ";", and "static".) I have no particular expectation
that there should be a single rule covering all the cases, just
because they happen to use the same symbol.
A reasonable person would have acknowledged that I've at least
tried to answer your question.
On Wed, 4 Dec 2024 21:01:07 +0000
Bart <bc@freeuk.com> wrote:
The C language is one of the most quirky ones around full of
apparently ridiculous things. Why shouldn't you be able to write
this for example:
{
....
L:
}
This stupid rule means that EVERY label in my generated code
needs to be written as L:; instead of just L:
Please don't say the label is only defined to be a prefix to
another statement. I am asking why it was done like that.
No good reason. It was a mistake by the Committee 35 years ago.
I don't know why it was not fixed in 3 subsequent iterations.
On 05/12/2024 10:59, Janis Papanagnou wrote:
On 05.12.2024 00:57, Bart wrote:
On 04/12/2024 22:58, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
OK, if it's so simple, explain it to me.
I'll pretend that was a sincere question.
Let's put it this way: if somebody asked me what the rule was, I
wouldn't be able to tell them.
If that is true I suggest reading one or two textbooks.
That answer suggests we're talking at cross-purposes.
So if the rules really aren't that simple, 100M users of the language
aren't going to find it any easier; they'd just be wasting time looking this up in textbooks.
That's not going to happen, and if they are casual users of their
compilers, they WILL write semicolons in the wrong place with no errors reported. Until it bites.
Fixing the language is always the best solution. Failing that, make
compilers report extraneous semicolons.
[...]
Both pure separators and pure terminators have issues.
[...]
[ Awk syntax as example ]
What's the difference between a scripting language and any other?
I've always wondered why people think a clean, clear, easy-to-understand syntax is OK for a scripting language, but is no good at all for a
'REAL' programming language.
[...]
I said that my generated code has to use ":;" after each label; it looks >>> weird.
It's funny that you put such an emphasis on labels. - Jumps to
labels have academically been long deprecated but, more importantly,
in my practical work I never had a need to use them (or even think
or considering to use them). (I suppose it's how one got socialized
to design and write programs.)
So you're one of those who never uses 'goto'. OK.
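(For balance, a sketch of my own of the one place goto is still widely considered idiomatic C - centralised error cleanup - which is also where labels near the end of a block naturally show up.)

#include <stdio.h>
#include <stdlib.h>

int load(const char *path)
{
    int rc = -1;
    FILE *fp = fopen(path, "rb");
    if (!fp)
        goto out;

    char *buf = malloc(4096);
    if (!buf)
        goto close_fp;

    if (fread(buf, 1, 4096, fp) == 0)   /* treat an empty read as failure */
        goto free_buf;
    rc = 0;

free_buf:
    free(buf);
close_fp:
    fclose(fp);
out:
    return rc;
}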
[...]
[...]
That's like some American boasting to us about how the USA got to the moon using imperial measurements, so they don't need metric. [...]
Brings up the thought of how, ASCII has a bunch of control
characters, but generally only a small number of them are used:
\r, \n, \t, \b
\e, \a, \v, \f (sometimes / rarely)
For CSV, we used ',' (a printable ASCII character) for something
that (theoretically) could have used \x1E (Record Separator).
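(A toy illustration of mine of that point: writing records with the ASCII unit/record separator bytes sidesteps the CSV quoting problems entirely, because those bytes never occur in ordinary text fields - at the cost of the file no longer looking nice in a text editor.)

#include <stdio.h>

#define US "\x1F"   /* unit (field) separator */
#define RS "\x1E"   /* record separator       */

int main(void)
{
    fputs("name" US "comment" RS, stdout);
    fputs("widget" US "contains, commas and \"quotes\" freely" RS, stdout);
    return 0;
}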
On Wed, 04 Dec 2024 22:48:20 GMT
scott@slp53.sl.home (Scott Lurndal) wrote:
David Brown <david.brown@hesbynett.no> writes:
Is that a "typical" build - or an atypical full clean build?=20
=20
That was a build that touched a key header file, so maybe
85% full. A full build adds a minute or two wall time.
=20
A single source file build (for most source files) takes
about 28 seconds real, a large portion of that in make as it
figures out what to rebuild in a project with thousands
of source files and associated dependencies.
LLVM/clang or gcc?
Sounds like you could benefit both from a faster compiler and from a faster
make utility.
BTW, can you experiment with -O0 ? What speedup does it provide over
-O2 in the project as big as yours?
On 04/12/2024 22:31, Bart wrote:
And while you are at it, buy a better screen. You were apparently
unable to read the bit where I said that /saving typing/ is a pathetic excuse.
You said my namespaces were jumbled. They generally work as people
expect, other than some different rules on name resolution.
No, they work as /you/ expect. Not "people".
You designed the language
for one person - you. It would be extraordinary if it did not work the
way /you/ expect!
That way they are influenced by lots of
opinions, experiences, desires, and use-cases way beyond what any one
person could ever have.
So the way namespaces work in C++ is something that - by my rough finger-in-the-air estimates - dozens of people were involved in forming
the plans, and hundreds of people would have had a chance to comment.
That suggests that the ratio of people who expect identifiers from
namespaces to require namespace qualification unless you /explicitly/
request importing them into the current scope, compared to the people
who expect namespaces to default to a jumble and overwhelm the current
scope, is of the order of millions to one.
Yes - /sometimes/ file lookup uses some kind of path. That happens for specific cases, using explicitly set path lists. Who would be happy
with an OS that when they tried to open a non-existent file "test.txt"
from their current directory in an editor, the system searched the
entire filesystem and all attached disks? When you use the command
"tcc", would you be happy with a random choice - or error message -
because someone else put a different program called "tcc.exe" on a
network drive somewhere?
No, I don't agree with them. Yes, it is your choice for your language.
But you choose to talk about your language - so I can tell you why I
think they are not good design choices.
Your compiler and tcc don't reach ankle-level to gcc, clang or even MSVC
It doesn't show anything useful is possible. No one else wants to
compile the limited subset of C that you want,
nor do they want a C
compiler written in some weird private language.
But it is not an alternative for other people. It is not some kind of
proof that compilers - real compilers for real work - don't have to be
large.
David Brown <david.brown@hesbynett.no> wrote:
On 04/12/2024 16:09, Bart wrote:
On 04/12/2024 09:02, David Brown wrote:
On 03/12/2024 19:42, Bart wrote:
Yesterday you tried to give the misleading impression that compiling a
substantial 200Kloc project only took 1-3 seconds with gcc.
No, I did not. I said my builds of that project typically take 1-3
seconds. I believe I was quite clear on the matter.
Without word "make" it was not clear if you mean full build (say
after checkout from repository). Frequently people talk about re-making
when they mean running make after a small edit and reserve build
for full build. So it was not clear if you claim to have a compile
farm with few hundred cores (so you can compile all files in parallel).
If I do a full, clean re-compile of the code, it takes about 12 seconds
or so. But only a fool would do that for their normal builds. Are you
such a fool? I haven't suggested you are - it's up to you to say if
that's how you normally build projects.
If I do a full, clean re-compile /sequentially/, rather than with
parallel jobs, it would be perhaps 160 seconds. But only a fool would
do that.
Well, when I download a project from the internet the first (and frequently
the only) compilation is a full build.
And if the build fails, IME
it is much harder to find the problem from the log of a parallel build.
So I frequently run full builds sequentially. Of course, I find
something to do while the computer is busy (300 seconds of computer time
spent on a full build is not worth an extra 30 seconds to find the trouble
in a parallel log - and for bigger things _both_ times grow, so the
conclusion is the same).
I gave some timings that showed gcc-O0 taking 50 times longer than tcc,
and 150 times longer with -O2.
That is the real picture. Maybe your machine is faster than mine, but I
doubt it is 100 times faster. (If you don't like my benchmark, then
provide another in portable C.)
All this just so you can crap all over the benefits of small, faster,
simpler tools.
Your small, fast, simple tools are - as I have said countless times -
utterly useless to me. Perhaps you find them useful, but I have never
known any other C programmer who would choose such tools for anything
but very niche use-cases.
The real picture is that real developers can use real tools in ways that
they find convenient. If you can't do that, it's your fault. (I don't
even believe it is true that you can't do it - you actively /choose/ not
to.)
And since compile speed is a non-issue for C compilers under most
circumstances, compiler size is /definitely/ a non-issue, and
"simplicity" in this case is just another word for "lacking useful
features", there are no benefits to your tools.
I somewhat disagree. You probably represent the opinion of the majority of developers. But that leads to uncontrolled runaway complexity and
bloat.
You clearly see the need to have fast and reasonably small code
on your targets. But there are also machines like the Raspberry Pi,
where normal tools, including compilers, can be quite helpful.
But such machines may have rather tight "disc" space, and CPU
use corresponds to power consumption, which preferably should be
low. So there is some interest in, and benefit from, smaller, more
efficient tools.
OTOH, people do not want to drop all features. And concerning
gcc, AFAIK it is actually a compromise for good reason. Some
other projects are slow and bloated apparently for no good
reason. Some time ago I found a text about the Netscape mail
index file. The author (IIRC Jamie Zawinski) explained how
its features ensured small size and fast loading. But in
later development it was replaced by some generic DB-like
solution leading to huge slowdown and much higher space
use (apparently the new developers were not willing to spend
a little time to learn how the old code worked). And similar
examples are quite common.
And concerning compiler size, I do not know if GCC/clang
developers care. But clearly Debian developers care:
they use shared libraries, split debug info into separate
packages, and similar measures to reduce size.
On 05/12/2024 04:11, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 04/12/2024 16:09, Bart wrote:
On 04/12/2024 09:02, David Brown wrote:
On 03/12/2024 19:42, Bart wrote:
Yesterday you tried to give the misleading impression that compiling a
substantial 200Kloc project only took 1-3 seconds with gcc.
No, I did not. I said my builds of that project typically take 1-3
seconds. I believe I was quite clear on the matter.
Without word "make" it was not clear if you mean full build (say
after checkout from repository). Frequently people talk about re-making
when they mean running make after a small edit and reserve build
for full build. So it was not clear if you claim to have a compile
farm with few hundred cores (so you can compile all files in parallel).
I talk about "building" a project when I build the project - produce the relevant output files (typically executables of some sort, appropriately post-processed).
If I wanted to say a "full clean build", I'd say that. If I wanted to include the time taken to check out the code from a repository, or to download and install the toolchain, or install the OS on the host PC,
I'd say that.
When I am working with code, I edit some files. Then I build the
project. Sometimes I simply want to do the build - to check for static errors, to see the size of code and data (as my targets are usually
resource limited), etc. Sometimes I want to download it to the target
and test it or debug it.
/Why/ would anyone want to do a full clean build of their project?
Bart is not interested in how much time it takes for people to build
their projects. He is interested in "proving" that his tools are
superior because he can run gcc much more slowly than his tools.
He
wants to be able to justify his primitive builds by showing that his
tools are so fast that he doesn't need build tools or project
management. (He is wrong on all that, of course - build tools and
project management is not just about choosing when you need to compile a file.)
Thus he also insists on single-threaded builds - compiling one
file at a time, so that he can come up with huge times for running gcc
on lots of files. This is, of course, madness - multi-core machines
have been the norm for a couple of decades. My cheap work PC has 14
cores and 20 threads - I'd be insane to compile one file at a time
instead of 20 files at a time.
The more a program is used, the more important its efficiency is. Yes,
gcc and clang/llvm developers care about speed. (They don't care much
about disk space.
Few users are bothered about $0.10 worth of disk space.)
On 05/12/2024 15:46, David Brown wrote:
On 05/12/2024 04:11, Waldek Hebisch wrote:
Few users are bothered about $0.10 worth of disk space.)
And again you fail to get the point. Disk space could be free, but that
doesn't mean a 1GB or 10GB executable is desirable. It would just be
slow and cumbersome.
Bart <bc@freeuk.com> writes:
On 05/12/2024 15:46, David Brown wrote:
On 05/12/2024 04:11, Waldek Hebisch wrote:
Few users are bothered about $0.10 worth of disk space.)
And again you fail to get the point. Disk space could be free, but that
doesn't mean a 1GB or 10GB executable is desirable. It would just be
slow and cumbersome.
Given that all modern systems load text and data pages from the executable dynamically on the first reference[*], I don't see how executable
size (much of which is debug and linker metadata such as
symbol tables) matters at all. Modern SSD's can move data at
better than 1GB/sec, so 'slow' isn't correct either.
On 05/12/2024 18:30, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 05/12/2024 15:46, David Brown wrote:
On 05/12/2024 04:11, Waldek Hebisch wrote:
Few users are bothered about $0.10 worth of disk space.)
And again you fail to get the point. Disk space could be free, but that
doesn't mean a 1GB or 10GB executable is desirable. It would just be
slow and cumbersome.
Given that all modern systems load text and data pages from the executable
dynamically on the first reference[*], I don't see how executable
size (much of which is debug and linker metadata such as
symbol tables) matters at all. Modern SSD's can move data at
better than 1GB/sec, so 'slow' isn't correct either.
So what the hell IS slowing down your build then?
Since converting C code to native code is something that can be done at >10MB/sec per core. Is your final binary 36GB?
The whole matter is confusing to me. David Brown says that build speed
is no problem at all, as it's only a second or two. You post figures of
13 MINUTES. Which is really an hour on one core.
If I say I find these quirks of C confusing, people will have
another go at me.
Sounds like you can benefit both from faster compiler and from faster
make utility.
Perhaps, but 28 seconds isn't really that bad. Particularly compared
with the systems I used forty years ago :-)
<snip linker>
BTW, can you experiment with -O0 ? What speedup does it provide over
-O2 in the project as big as yours?
We typically use -O3, except on a couple of source files which
take over 6 minutes of CPU time to compile, each, when -O3 is
specified.
Bart <bc@freeuk.com> writes:
Since converting C code to native code is something that can be done at
10MB/sec per core. Is your final binary 36GB?
Of course not, I posted the size, it's 7MB.
Bart <bc@freeuk.com> writes:
If I say I find these quirks of C confusing, people will have
another go at me.
People object not because you say some parts of C are confusing but
because you make them out to be more confusing than they are. These
things are not that hard to understand. No one has any doubt that
you could understand them if you made any sort of effort, but you
don't even try. All you do is complain. If you took even one tenth
the time you spend complaining on learning how things work then you
would understand them and could usefully spend the leftover time
doing something of value. As it is you do nothing but waste
everyone's time, including your own.
On 05/12/2024 20:21, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
Since converting C code to native code is something that can be done at
10MB/sec per core. Is your final binary 36GB?
Of course not, I posted the size, it's 7MB.
You posted a 7MB size (and an 8MB one), in a post about hypothetical
1-10GB executables. I didn't realise it was the size of the project that
took 1 hour of CPU time to build.
So that's about 2KB/second of CPU time.
Bart <bc@freeuk.com> writes:
On 05/12/2024 20:21, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
Since converting C code to native code is something that can be done at
10MB/sec per core. Is your final binary 36GB?
Of course not, I posted the size, it's 7MB.
You posted a 7MB size (and an 8MB one), in a post about hypothetical
1-10GB executables. I didn't realise it was the size of the project that
took 1 hour of CPU time to build.
So that's about 2KB/second of CPU time.
That is a completely meaningless metric.
Bart <bc@freeuk.com> writes:
[...]
This is a discussion of language design. Not one person trying to
understand how some feature works. I could learn how X works, but that
won't help a million others with the same issue.
That's great, if there are a million other users who are confused
about the same things you claim to be confused about, and if you're
actually trying to help them.
Except for a couple of things.
I don't believe there are "a million other users" who are confused
about, to use a recent example, C's rules about whether a "}" needs
to be followed by a semicolon. I've seen code with extraneous
semicolons at file scope, and I mostly blame gcc's lax default
behavior for that. But I've never heard anyone other than you
complain about C's rules being intolerably inconsistent.
And even if those users exist, I don't see you making any effort
to help them, for example by explaining how C actually works.
(Though you do manage to provoke some of us into doing that for you.)
On 04/12/2024 19:47, Janis Papanagnou wrote:
On 04.12.2024 13:08, David Brown wrote:
Sure. And it is certainly /possible/ to know all the small details of C
without ever reading the standards. But it's quite unlikely.
The question is (IMO) not so much to know "all" and even "all small"
details. Even in a language like "C" (that I'd consider to be fairly
incoherent if compared to other languages' design) you can get all
"important" language properties (including details) from textbooks.
In cases where that is different, the standards documents - which
have their very own special way of being written - would be even
less comprehensible than they (inherently) already are. - That said
from a programmer's POV (not from the language implementors').
I look into language standards only if I want to confirm/falsify
an implementation; in this case I'm taking the role of a language
implementor (not a programmer). Personally I do that anyway only
rarely, for specific languages only, and just out of academic
interest.
[...]
Bart is an expert at thinking up things in C that confuse him.
Well, that's a topic of its own. - Where I was really astonished was the
statement of being confused about the braces/semicolons, which is
so fundamental (and primitive) but technically just a detail that
I'd thought it should be clear
OK, if it's so simple, explain it to me.
Apparently the first line here needs a semicolon after }, the second
doesn't:
int X[1] = {0};
void Y() {}
Similarly here:
if (x) y;
if (x) {}
Why?
"Because that's what the grammar says" isn't a valid answer.
The C language is one of the most quirky ones around, full of apparently ridiculous things. Why shouldn't you be able to write this for example:
{
....
L:
}
This stupid rule means that EVERY label in my generated code needs to be written as L:; instead of just L:
Please don't say the label is only defined to be a prefix to another statement. I'm asking why it was done like that.
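(Again for reference, a minimal sketch of the rule being complained about,
assuming C17 or earlier: in that grammar a label must be attached to a
statement, so it cannot directly precede the closing '}'; the usual
workaround is a null statement. C23 relaxes this and allows a label at the
end of a block.)

    void g(int x)
    {
        if (x) goto done;
        /* ... */
    done: ;   /* the ';' is a null statement carrying the label;            */
    }         /* "done:" immediately before '}' is a syntax error in C17    */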
Bart <bc@freeuk.com> writes:
while cond();
A do-while, but comment out the 'do', and it's a compound statement
followed by a while loop
You forgot the parentheses around the condition.
This duality of 'while' bothers me.
Noted as yet another thing that bothers you.
Can you explain briefly just what you're trying to accomplish here?
On 05.12.2024 12:54, Bart wrote:
That's not going to happen, and it they are casual users of their
compilers, they WILL write semicolons in the wrong place with no errors
reported. Until it bites.
Semicolons, really, are not the problem - AFAIR, in *none* of
the languages I was engaged with.
On 05/12/2024 14:00, David Brown wrote:
On 04/12/2024 22:31, Bart wrote:
And while you are at it, buy a better screen. You were apparently
unable to read the bit where I said that /saving typing/ is a pathetic
excuse.
I can't type. It's not a bad excuse at all. I have also suffered from
joint problems. (In the 1980s I was sometimes typing while wearing
woolen gloves to lessen the impact. I haven't needed to now; maybe I've slowed down, but I do take more care.)
You said my namespaces were jumbled. They generally work as people
expect, other than some different rules on name resolution.
No, they work as /you/ expect. Not "people".
The concept is so simple that is not much about it for anyone to tweak.
A qualified name looks like this (this also applies to a single identifier):
A.B.C
'A' is a top-level name. It is this that the name-resolution in a
language needs to identify.
Presumably, in your C and C++ languages, all possible top-level names
must be directly visible. Only in C++ with 'using namespace' does it
allow a search for names not directly visible, by looking also in
certain places.
This is where the rules diverge a little from what I do. Given 'CHUMMY' modules X, Y, Z (which I deliberately choose), then each effectively
includes 'using namespace' for each of the other two modules.
So my approach is a little more informal, and more convenient.
You designed the language for one person - you. It would be
extraordinary if it did not work the way /you/ expect!
Yep. And yet, it is not some obscure, exotic monstrosity. It's not that different from most others, and it's not that hard to understand.
That way they are influenced by lots of opinions, experiences,
desires, and use-cases way beyond what any one person could ever have.
A shame it didn't work for C!
So the way namespaces work in C++ is something that - by my rough
finger-in-the-air estimates - dozens of people were involved in
forming the plans, hundreds of people would have had a chance to
comment and
Yeah. Design-by-committee. Just about every feature of C++ is a dog's
dinner of overblown, over-complex features. C++ doesn't do 'simple'.
(Look at the 'byte' type for example; it's not just a name for 'u8'!)
That suggests that the ratio of people who expect identifiers from
namespaces to require namespace qualification unless you /explicitly/
request importing them into the current scope, compared to the people
who expect namespaces to default to a jumble and overwhelm the current
scope, is of the order of millions to one.
They're not imported into the current scope, but an outer scope.
Anything with the same name in the current scope will shadow those imports.
C and C++ both have block scopes, yes? So the potential is there for
nested scopes dozens of levels deep.
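(A minimal sketch of that shadowing in C, with hypothetical variables: an
inner block may redeclare a name, hiding the outer one until the block ends.)

    #include <stdio.h>

    int n = 1;                      /* file scope */

    int main(void)
    {
        int n = 2;                  /* shadows the file-scope n */
        {
            int n = 3;              /* shadows the function-scope n */
            printf("%d\n", n);      /* prints 3 */
        }
        printf("%d\n", n);          /* prints 2 */
        return 0;
    }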
Yes - /sometimes/ file lookup uses some kind of path. That happens
for specific cases, using explicitly set path lists. Who would be
happy with an OS that when they tried to open a non-existent file
"test.txt" from their current directory in an editor, the system
searched the entire filesystem and all attached disks? When you use
the command "tcc", would you be happy with a random choice - or error
message - because someone else put a different program called
"tcc.exe" on a network drive somewhere?
That's exactly how Windows works.
I think Linux works like that too:
since tcc.exe is not a local file, and it doesn't use a path, it uses
special rules to locate it. Somebody could mess about with those rules.
So, why don't you always have to type:
/usr/bin/tcc.0.9.28/tcc # or whatever
instead of just 'tcc'?
No, I don't agree with them. Yes, it is your choice for your language.
But you choose to talk about your language - so I can tell you why I
think they are not good design choices.
My design is to allow default 'using namespace'; see above.
That's ... pretty much it.
Your compiler and tcc don't reach ankle-level to gcc, clang or even MSVC
That's good.
I once had to build a garden gate for my house. I decided to make my
own, and ended up with a 6' wooden gate which was exactly what I needed.
If LLVM made gates, theirs would have been 9 miles high. A little
unwieldy, and probably a hazard for air traffic.
It doesn't show anything useful is possible. No one else wants to
compile the limited subset of C that you want,
That's not true. Thiago Adams would find it useful for one. Obviously I do.
Anyone using C as an intermediate language would do so. As would anyone
writing C up to C99.
nor do they want a C compiler written in some weird private language.
Weird? There's a lot more weirdness in C!
But it is not an alternative for other people. It is not some kind of
proof that compilers - real compilers for real work - don't have to be
large.
I suspect your preferred compilers wouldn't even run on a Cray-1 supercomputer, perhaps not even dozens of them.
Yet people /were/ running compilers for all sorts of languages on
machines that weren't as powerful. Professional people writing
professional software.
You're effectively calling all those toys.
Meanwhile DB later admits real clean build times of nearly 3 minutes for
one core.
David Brown <david.brown@hesbynett.no> writes:
On 04/12/2024 17:55, Scott Lurndal wrote:
13 minutes by the wall clock, a bit over an hour
of CPU time.
Is that a "typical" build - or an atypical full clean build?
That was a build that touched a key header file, so maybe
85% full. A full build adds a minute or two wall time.
A single source file build (for most source files) takes
about 28 seconds real, a large portion of that in make as it
figures out what to rebuild in a project with thousands
of source files and associated dependencies.
On 05/12/2024 15:46, David Brown wrote:
On 05/12/2024 04:11, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 04/12/2024 16:09, Bart wrote:
On 04/12/2024 09:02, David Brown wrote:
On 03/12/2024 19:42, Bart wrote:
Yesterday you tried to give the misleading impression that compiling a
substantial 200Kloc project only took 1-3 seconds with gcc.
No, I did not. I said my builds of that project typically take 1-3
seconds. I believe I was quite clear on the matter.
Without word "make" it was not clear if you mean full build (say
after checkout from repository). Frequently people talk about re-making
when they mean running make after a small edit and reserve build
for full build. So it was not clear if you claim to have a compile
farm with few hundred cores (so you can compile all files in parallel).
I talk about "building" a project when I build the project - produce
the relevant output files (typically executables of some sort,
appropriately post-processed).
If I wanted to say a "full clean build", I'd say that. If I wanted to
include the time taken to check out the code from a repository, or to
download and install the toolchain, or install the OS on the host PC,
I'd say that.
When I am working with code, I edit some files. Then I build the
project. Sometimes I simply want to do the build - to check for
static errors, to see the size of code and data (as my targets are
usually resource limited), etc. Sometimes I want to download it to
the target and test it or debug it.
/Why/ would anyone want to do a full clean build of their project?
Suppose your full clean build took, if not 0 seconds, then as close to it as
made no difference. WHY would you waste time mucking with partial builds?
Bart is not interested in how much time it takes for people to build
their projects. He is interested in "proving" that his tools are
superior because he can run gcc much more slowly than his tools.
It IS slower. And mine isn't the fastest either. On one test from 3-4
years ago spanning 30-40 language implementations, there was about
3000:1 between fastest and slowest expressed as Kloc/sec. My two
languages were 8 and 4 times slower than the fastest.
He wants to be able to justify his primitive builds by showing that
his tools are so fast that he doesn't need build tools or project
management. (He is wrong on all that, of course - build tools and
project management is not just about choosing when you need to compile
a file.)
You keep saying I'm wrong; I don't often say that you're wrong, but here
you are because you keep choosing to disregard certain facts.
One is that a make system DOES do multiple things, but one of those IS applying dependency info to decide which modules to incrementally
compile, which are then linked with existing object files into a binary.
THIS is exactly what my whole-program compiler takes care of. It doesn't eliminate everything else a make system does, but I don't make use of
those myself.
Another is that you refuse to acknowledge genuine use-cases for fast C compilers.
A big one is when C is used as an intermediate language for a
compiler.
Then it just wants as fast a translation into runnable code as
possible. It doesn't need detailed analysis of the C code.
It would also be nice for that compiler to be simple enough to bundle as
part of the app. That is not really practical to do with gcc if your intention is a reasonably small and tidy implementation, apart from
edit-run cycles being dominated by the slowness of gcc in producing even indifferent code.
Such intermediate C may present challenges: it might be very busy. It
might be generated as one C source file. That shouldn't matter.
But you seem to be blind to these possibilities, and stubbornly refuse to admit that a product like tcc may actually be a good fit here.
Some options for a backend for a new language might be to interpret the result, but even tcc's indifferent machine code would run at least an order of
magnitude faster.
Thus he also insists on single-threaded builds - compiling one
file at a time, so that he can come up with huge times for running gcc
on lots of files. This is, of course, madness - multi-core machines
have been the norm for a couple of decades. My cheap work PC has 14
cores and 20 threads - I'd be insane to compile one file at a time
instead of 20 files at a time.
What about someone whose workload is 10 times yours? Their builds will
take 10 times as long on the same machine. They would be helped by
faster compilers.
BTW nothing stops my compiler being run on multiple cores too. I assume
that is a function of your 'make' program?
However the non-C one would need to build multiple executables, not
object files, as a whole-program compiler can't easily be split into parallel tasks.
So I'm looking only at raw compilation speed on one core.
Car analogy: if you were developing a fast car, and wanted one that
could do 200 miles in one hour, using 10 different cars each travelling
at 20mph along a different segment of the route wouldn't be what most
people had in mind.
The more a program is used, the more important its efficiency is.
Yes, gcc and clang/llvm developers care about speed. (They don't care
much about disk space.
Ha!
Few users are bothered about $0.10 worth of disk space.)
And again you fail to get the point. Disk space could be free, but that
doesn't mean a 1GB or 10GB executable is desirable. It would just be
slow and cumbersome.
Car analogy #2: supposed a 40-tonne truck cost the same as 1-tonne car.
Which do you think would be quicker to drive to the supermarket and back?
My language:
println =a,=b # 13 characters, 0 shifted
(And that space is optional!) In C (assume the existence of stdio.h):
printf("A=%lld B=%f\n",a,b); # ~28 characters, 8 shifted
Enough said.
On 05/12/2024 16:29, Bart wrote:
On 05/12/2024 14:00, David Brown wrote:
On 04/12/2024 22:31, Bart wrote:
And while you are at it, buy a better screen. You were apparently
unable to read the bit where I said that /saving typing/ is a
pathetic excuse.
I can't type. It's not a bad excuse at all. I have also suffered from
joint problems. (In the 1980s I was sometimes typing while wearing
woolen gloves to lessen the impact. I haven't needed to now; maybe
I've slowed down, but I do take more care.)
Being unable to type, on its own, is not a good excuse - it's an
essential skill for productive programmers.
The goal of that last part is to reduce the number of
characters you have to type, rather than the number of characters in the source code.
If I am writing module X and I want to use symbol "foo"
from module Y, then I expect to say explicitly that I am using module Y
(via an "import" or "include" statement), and I expect to say explicitly
that I want "foo" from module Y (as "Y.foo", "Y::foo", "using Y", "using Y.foo", "from foo import Y", or whatever suits the language).
When I read a source file in any language, and I see the use of an
identifier "foo", I want to know where that came from - I want to be
able to find that out without doubt. If it is written "Y.foo", then
there is no doubt. If I see "using Y.foo", there is no doubt - but I'd
You have snatched defeat from the jaws of victory in order to save a
couple of lines at the start of each file - with the result that your language is only suitable for small single-developer programs.
So my approach is a little more informal, and more convenient.
"Informal" is a common theme for your language and your tools.
And
"convenient" means convenient for you alone, not other people, and
certainly not other people working together.
I suppose that is
reasonable enough since you are the only one using your language and
tools - why should you bother making something useful for people who
will never use them? But this is a root cause of why no one thinks your language or your tools would be of any use to them - and why you get so
much backlash when you try to claim that they are better than other
languages or tools.
Yep. And yet, it is not some obscure, exotic monstrosity. It's not
that different from most others, and it's not that hard to understand.
It /is/ obscure and exotic.
No, it is not. You set PATH to the directories you want to use for
binaries - the OS does not search the entire disk or random additional attached filesystems.
No, your design is to /force/ "using namespace" - whether the programmer wants it or not, and without the programmer even identifying the modules
to import. It's your choice - but it's a bad choice.
No, I'm calling tcc and your compiler "toys".
On 2024-12-06, Bart <bc@freeuk.com> wrote:
My language:
println =a,=b # 13 characters, 0 shifted
(And that space is optional!) In C (assume the existence of stdio.h):
printf("A=%lld B=%f\n",a,b); # ~28 characters, 8 shifted
Enough said.
Looks like a cherry-picked example.
How would this (slightly modified) C statement be notated in your language:
printf("a:%lld b:%g\n",a,b);
On 05/12/2024 18:42, Bart wrote:
A big one is when C is used as an intermediate language for a compiler.
Why would that be relevant? If you compile language X by first
transpiling to C, then compiling C, then the time to compile language X
is the sum of these.
On 04/12/2024 23:48, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 04/12/2024 17:55, Scott Lurndal wrote:
13 minutes by the wall clock, a bit over an hour
of CPU time.
Is that a "typical" build - or an atypical full clean build?
That was a build that touched a key header file, so maybe
85% full. A full build adds a minute or two wall time.
A single source file build (for most source files) takes
about 28 seconds real, a large portion of that in make as it
figures out what to rebuild in a project with thousands
of source files and associated dependencies.
That sounds like your dependency handling in make is not ideal. Store
your dependencies in individual files per source file, where both the
object file and the dependency file itself are dependent on the headers
used by the C file. (The makefile should also have both the object file
and the dependency file dependent on the C source file.)
Thus your build only generates new dependency files when either the C
source file changes, or any of the headers included (recursively) by
that C file change. Change a C file, and its dependencies will be re-calculated - but not the dependencies for anything else. Change a
header, and you only need to re-calculate the dependencies for the
source files that actually use it.
Ike Naar <ike@sdf.org> writes:
On 2024-12-06, Bart <bc@freeuk.com> wrote:
My language:
println =a,=b # 13 characters, 0 shifted
(And that space is optional!) In C (assume the existence of stdio.h):
printf("A=%lld B=%f\n",a,b); # ~28 characters, 8 shifted
Enough said.
Looks like a cherry-picked example.
How would this (slightly modified) C statement be notated in your language: >>
printf("a:%lld b:%g\n",a,b);
?
Or
printf("a:%2$8.8lld b:%1$10.2g",b,a);
[...]
The first few languages I wrote substantial code in (before I started
doing my stuff) didn't use semicolons. (That is, Fortran, assembly and a machine-oriented language. Assembly did use semicolons for comments;
that was no problem!)
[...]
On 06/12/2024 00:50, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
This is a discussion of language design. Not one person trying to
understand how some feature works. I could learn how X works, but that
won't help a million others with the same issue.
That's great, if there are a million other users who are confused
about the same things you claim to be confused about, and if you're
actually trying to help them.
Except for a couple of things.
I don't believe there are "a million other users" who are confused
about, to use a recent example, C's rules about whether a "}" needs
to be followed by a semicolon. I've seen code with extraneous
semicolons at file scope, and I mostly blame gcc's lax default
behavior for that. But I've never heard anyone other than you
complain about C's rules being intolerably inconsistent.
And even if those users exist, I don't see you making any effort
to help them, for example by explaining how C actually works.
(Though you do manage to provoke some of us into doing that for you.)
You really think I'm a beginner stumped on some vital question of
whether I should or should not have used a semicolon in a program I'm
writing?
I said I wouldn't be able to explain the rules. Otherwise I stumble along
like everyone else.
If I catch sight of this at the top of my screen window:
-------------------------
}
....
is that } ending some compound statement, or is it missing a ";" because
it's initialising some data? Or (if I know I'm outside a function), is
it the end of function definition?
I wouldn't be stumped for long, but it strikes me as odd that you can't
tell just by looking.
I used to be puzzled by this too: 'while' can both start a while
statement, and it can delimit a do-while statement. How is that possible?
Again if you only caught a glimpse like this:
----------------
while (cond);
then /probably/ it is the last line of a do-while, but it could also
legally be a while loop with an empty body. (Maybe temporarily empty, or
the ; is a mistake.)
How about this:
do
{
....
}
while cond();
A do-while, but comment out the 'do', and it's a compound statement
followed by a while loop
This duality of 'while' bothers me.
[...]
On 06.12.2024 02:20, Bart wrote:
I used to be puzzled by this too: 'while' can both start a while
statement, and it can delimit a do-while statement. How is that possible?
A keyword (like 'while') is just a lexical token that can be used in different syntactical contexts.
(Even with different semantics, if a language designer thinks this is a good idea for his case.)
You may be less confused with using Pascal;
while positive-case do ...
until negative-case do ...
do ... while positive-case
do ... until negative-case
(depending on language with this or similar syntax).
There's nothing wrong using the same keyword for the same type of
condition, rather it's good to have it defined that way.
On 04/12/2024 16:38, Tim Rentsch wrote:
Ben Bacarisse <ben@bsb.me.uk> writes:
Bart <bc@freeuk.com> writes:
On 03/12/2024 11:15, Ben Bacarisse wrote:
Bart <bc@freeuk.com> writes:
...
If I write this
int *A, B[10], C(int);
My compiler tells me that:
A is a local variable [...]
B is a local variable [...]
C is a function with return type of 'i32', taking one
unnamed parameter of type 'i32'.
(Interestingly, it places C into module scope, so the same
declaration can also create names in different scopes!)
That means your compiler is not compiling standard C. In standard C
all entities declared locally have block scope, not file scope.
OK, it can be added to the list of things that make C harder to
compile.
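(A minimal sketch of what that declaration means in standard C, assuming it
appears at block scope; the names A, B and C are taken from the example
above. Per C11/C17 all three identifiers have block scope here, although
the function that C refers to has external linkage and must be defined
elsewhere.)

    int C(int x) { return x + 1; }   /* the definition lives at file scope */

    void f(void)
    {
        int *A, B[10], C(int);   /* one declaration, three declarators:    */
                                 /* A: pointer to int (object)             */
                                 /* B: array of 10 int (object)            */
                                 /* C: function taking int, returning int  */
        A = &B[0];
        B[0] = C(42);            /* calls the C defined above              */
    }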
In a 39Kloc C codebase, there were 21,500 semicolons.
This is generated C, so some of those may be following labels!
2000, actually, which still leaves 19,500; a lot of semicolons.
On a preprocessed sql.c test (to remove comments) there were
53,000 semicolons in 85,000 lines.
So in C, they are a very big deal, occurring on every other line.
On 05/12/2024 20:39, Tim Rentsch wrote:
Bart <bc@freeuk.com> writes:
If I say I find these quirks of C confusing, people will have
another go at me.
People object not because you say some parts of C are confusing but
because you make them out to be more confusing than they are. These
things are not that hard to understand. No one has any doubt that
you could understand them if you made any sort of effort, but you
don't even try. All you do is complain. If you took even one tenth
the time you spend complaining on learning how things work then you
would understand them and could usefully spend the leftover time
doing something of value. As it is you do nothing but waste
everyone's time, including your own.
This is a discussion of language design.
I wouldn't insult.
On 07/12/2024 11:34, Janis Papanagnou wrote:
On 06.12.2024 02:20, Bart wrote:
I used to be puzzled by this too: 'while' can both start a while
statement, and it can delimit a do-while statement. How is that
possible?
A keyword (like 'while') is just a lexical token that can be used in
different syntactical contexts.
Is it common to both start and end a statement with the same keyword?
(Even with different semantics, if a
language designer thinks this is a good idea for his case.)
You may be less confused with using Pascal;
Now you are demonstrating my point about being treated like a beginner.
And it is exasperating.
This is a point of language design. C's approach works - just.
But it relies on some subtlety.
'while (cond)' both starts a statement, and can
end a statement:
do while(cond) do ; while (cond);
The second 'while' here starts a nested while-loop inside a do-while
loop. Not confusing?
It is to me! I couldn't finish my example as I got
lost (but as it happens, this is valid code, partly by luck).
Of course, if you put aside all other matters and concentrate on the
minutiae of the syntax details, then it is all 'obvious'. I think it
needs to be a lot more obvious.
while positive-case do ...
until negative-case do ...
do ... while positive-case
do ... until negative-case
(depending on language with this or similar syntax).
There's nothing wrong using the same keyword for the same type of
condition, rather it's good to have it defined that way.
The problem here is exactly the same: how do you tell whether the
'while' in 'do ... while' is ending this 'do'-loop, or starting a nested while-loop?
You don't think there is an element of ambiguity here?
Let's take my example again:
do while(cond) do ; while (cond);
And make a small tweak:
do ; while(cond) do ; while (cond);
It still compiles, it is still valid code.
But now it does something
utterly different. (One is two nested loops, the other is two
consecutive loops - I think!)
OK, I get now that people just aren't that bothered about such matters.
Obviously they aren't language designers.
But I also have to spend time battling endless personal attacks
because I'm perceived as an upstart (even though I'm now receiving
my state pension!).
On 07.12.2024 15:57, Bart wrote:
The two examples need to be written like this to be valid:
do while (cond) ; while (cond);
do ; while (cond) ; while (cond);
So this /is/ two nested loops followed by two consecutive ones. The
difference is that semicolon.
Someone here already called things like that "condensed" (or some
such), meaning that syntax is strict and has few redundancies; small
changes are meaningful and relevant. (Nothing for sloppy hackers, for
sure.)
[...]
Those suggesting that semicolons are unimportant details of syntax in C
are wrong. [...]
I don't recall; who said that?
The two examples need to be written like this to be valid:
do while (cond) ; while (cond);
do ; while (cond) ; while (cond);
So this /is/ two nested loops followed by two consecutive ones. The difference is that semicolon.
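(A minimal sketch of how the grammar resolves these, assuming 'cond' is an
int declared elsewhere; the two statements are the corrected examples above.
The 'while' that immediately follows the body of a 'do' closes that 'do';
a 'while' that appears where a statement is expected starts a new loop.)

    int cond;   /* hypothetical condition variable */

    void demo(void)
    {
        /* Nested: the body of the 'do' is a while-loop with an empty body;
           the final 'while (cond);' is what terminates the 'do'. */
        do while (cond) ; while (cond);

        /* Consecutive: the lone ';' is the empty body of the 'do', so
           'while (cond) ;' completes that do-while, and the second
           'while (cond);' begins a separate while-loop with an empty body. */
        do ; while (cond) ; while (cond);
    }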
[...]
Those suggesting that semicolons are unimportant details of syntax in C
are wrong. [...]
On 06/12/2024 15:41, David Brown wrote:
On 05/12/2024 16:29, Bart wrote:
On 05/12/2024 14:00, David Brown wrote:
On 04/12/2024 22:31, Bart wrote:
And while you are at it, buy a better screen. You were apparently
unable to read the bit where I said that /saving typing/ is a
pathetic excuse.
I can't type. It's not a bad excuse at all. I have also suffered from
joint problems. (In the 1980s I was sometimes typing while wearing
woolen gloves to lessen the impact. I haven't needed to now; maybe
I've slowed down, but I do take more care.)
Being unable to type, on its own, is not a good excuse - it's an
essential skill for productive programmers.
The goal of that last part is to reduce the number of characters you
have to type, rather than the number of characters in the source code.
My language:
println =a,=b # 13 characters, 0 shifted
(And that space is optional!)
In C (assume the existence of stdio.h):
printf("A=%lld B=%f\n",a,b); # ~28 characters, 8 shifted
Enough said.
If I am writing module X and I want to use symbol "foo" from module
Y, then I expect to say explicitly that I am using module Y (via an
"import" or "include" statement), and I expect to say explicitly that
I want "foo" from module Y (as "Y.foo", "Y::foo", "using Y", "using
Y.foo", "from foo import Y", or whatever suits the language).
When I read a source file in any language, and I see the use of an
identifier "foo", I want to know where that came from - I want to be
able to find that out without doubt. If it is written "Y.foo", then
there is no doubt. If I see "using Y.foo", there is no doubt - but I'd
If you used my language, you could type Y.foo if you wanted. Nothing stops
you doing that.
And if you were using it, I can make the ability to omit 'Y.' optional.
However, you'd need to get used to a language and compiler that is 'whole-program', with a slightly different approach
You have snatched defeat from the jaws of victory in order to save a
couple of lines at the start of each file - with the result that your
language is only suitable for small single-developer programs.
It is suitable for a vast number of programs. I'm sure it could be used
for multi-developer projects too. After all, don't your projects also
place all project-structure info into /one/ makefile? Which developer
has editing rights on that file?
Who decides what the modules are going to be? If one developer decided
they need a new module, what is the procedure to get that added to the makefile?
On 06/12/2024 17:47, David Brown wrote:
On 05/12/2024 18:42, Bart wrote:
A big one is when C is used as an intermediate language for a compiler.
Why would that be relevant? If you compile language X by first
transpiling to C, then compiling C, then the time to compile language X
is the sum of these. A fast or slow C compiler is no more and no less relevant here than if you were just compiling C.
You don't give do you?!
If your task to get from A to B was split into two, you'd be happy to do
the first part by a fast car, then complete the rest of it on a horse
and cart, for no reason at all?
On 07.12.2024 14:04, Bart wrote:
Just fine, I'd say.
But it relies on some subtlety.
You seem to see ghosts. There's no subtlety; all is clearly defined,
and it's a sensible feature, and consistently implemented.
'while (cond)' both starts a statement, and can
end a statement:
do while(cond) do ; while (cond);
What is this (IMO syntactically wrong "C" code) supposed to do or to
explain?
Your (wrong) second 'do' was indeed confusing! - Why did you provide
a wrong sample to confirm your wrong ideas? - Or is it just your well
known habit to try to confuse other folks as well by making up stupid
things?
Most here (and me too) already acknowledged that "C" is not obvious
to you.
You don't think there is an element of ambiguity here?
There isn't any.
David Brown <david.brown@hesbynett.no> writes:
On 04/12/2024 23:48, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 04/12/2024 17:55, Scott Lurndal wrote:
13 minutes by the wall clock, a bit over an hour
of CPU time.
Is that a "typical" build - or an atypical full clean build?
That was a build that touched a key header file, so maybe
85% full. A full build adds a minute or two wall time.
A single source file build (for most source files) takes
about 28 seconds real, a large portion of that in make as it
figures out what to rebuild in a project with thousands
of source files and associated dependencies.
That sounds like your dependency handling in make is not ideal. Store
your dependencies in individual files per source file, where both the
object file and the dependency file itself are dependent on the headers
used by the C file. (The makefile should also have both the object file
and the dependency file dependent on the C source file.)
We use the standard makedepend support built into GCC. It creates
a .d file for each source file that contains the dependencies
for that source file.
Scale matters; with two thousand+ source and header files, that's a lot
of 'access', 'read' and 'stat' system calls just to build the dependency
graph. On NFS filesystems, which we use in most environments, the make
takes even more time.
Thus your build only generates new dependency files when either the C
source file changes, or any of the headers included (recursively) by
that C file change. Change a C file, and its dependencies will be
re-calculated - but not the dependencies for anything else. Change a
header, and you only need to re-calculate the dependencies for the
source files that actually use it.
In the case above, over 1000 source files were dependent upon that
key project-wide header file which had been modified.
Bart <bc@freeuk.com> writes:
But I also have to spend time battling endless personal attacks
because I'm perceived as an upstart (even though I'm now receiving
my state pension!).
Dictionaries describe upstarts as people who are full of themselves
and dismissive of others. So if you are perceived as an upstart,
it's because you are one.
Bart <bc@freeuk.com> writes:
In a 39Kloc C codebase, there were 21,500 semicolons.
This is generated C, so some of those may be following labels!
2000, actually, which still leaves 19,500; a lot of semicolons.
On a preprocessed sql.c test (to remove comments) there were
53,000 semicolons in 85,000 lines.
So in C, they are a very big deal, occuring on every other line.
Many years ago I read a book called "How to Lie with Statistics".
It should be updated to add this reporting as an example.
On 06/12/2024 19:41, Bart wrote:
My language:
println =a,=b # 13 characters, 0 shifted
Since some of us live in a world with more than one person - indeed,
more than one country - you might like to know that "=" is shifted on
many keyboards.
(And that space is optional!)
The biggest irritation I have with your style, whether in your language
or in C, is the lack of space. It's a big key - it's easy to press.
Good spacing habits (in the broadest sense) are the single biggest factor
in the readability of code, text, hand-writing, and pretty much anything else.
The language Forth is not known for being easy to read amongst non-
experts, but one thing it gets right is insisting that there are spaces between tokens.
In C (assume the existence of stdio.h):
printf("A=%lld B=%f\n",a,b); # ~28 characters, 8 shifted
Enough said.
I don't know what you are getting at, so I don't know if enough has been
said or not. If you are trying to suggest that the only thing you think
is important about language syntax is how you print out two variables in
as few keystrokes as possible, then I suppose you have succeeded in that goal, and can therefore rest your case.
On 07/12/2024 16:17, David Brown wrote:
On 06/12/2024 23:26, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 04/12/2024 23:48, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 04/12/2024 17:55, Scott Lurndal wrote:
13 minutes by the wall clock, a bit over an hour
of CPU time.
Is that a "typical" build - or an atypical full clean build?
That was a build that touched a key header file, so maybe
85% full. A full build adds a minute or two wall time.
A single source file build (for most source files) takes
about 28 seconds real, a large portion of that in make as it
figures out what to rebuild in a project with thousands
of source files and associated dependencies.
That sounds like your dependency handling in make is not ideal. Store
your dependencies in individual files per source file, where both the
object file and the dependency file itself are dependent on the headers
used by the C file. (The makefile should also have both the object file
and the dependency file dependent on the C source file.)
We use the standard makedepend support built into GCC. It creates
a .d file for each source file that contains the dependencies
for that source file.
This is part of the makefile macro I use when generating dependency
files:
$(CCDEP) $(combined_cflags) $(2) $(3) -MM -MP \
-MT $(obj_dir)$(1)$$(@F:%.d=%.o) \
-MT $(dep_dir)$(1)$$(@F:%.o=%.o) \
-MF $$@ $$<
The point is that because the dependency files themselves are also
targets in these dependency files, they don't need to be re-built
unnecessarily.
Scale matters; with two thousand+ source and header files, that's a lot
of 'access', 'read' and 'stat' system calls just to build the dependency
graph. On NFS filesystems, which we use in most environments, the make
takes even more time.
If the issue is stat'ing the times for those files in order to check
the dependencies, then I understand the bottleneck. But I would not
expect people to be sharing a source tree on NFS like that - it is
certainly not the usual way of collaborating on projects.
Certainly it all sounds like you are doing something seriously
inefficient. You've only got ten times the number of files that I
have, but massively slower update builds.
This is to be lauded. Somebody actually questioning whether build-times should be that long.
Somebody other than me, that is.
On 07/12/2024 15:57, David Brown wrote:
On 06/12/2024 19:41, Bart wrote:
My language:
println =a,=b # 13 characters, 0 shifted
Since some of us live in a world with more than one person - indeed,
more than one country - you might like to know that "=" is shifted on
many keyboards.
I want to make my comment both fit on one line, and line up with the
other, so I had to leave out '(on my keyboard)'. I'd hoped that was implied.
(And that space is optional!)
The biggest irritation I have with your style, whether in your
language or in C, is the lack of space. It's a big key - it's easy to
press.
Spaces were left out to do a fairer comparison. Spaces aren't a big
deal, they are fairly easy to type. I even left one in, in my example,
because it looked too odd otherwise, and I was still ahead anyway.
Good spacing habits (in the broadest sense) is the single biggest
factor to readability of code, text, hand-writing, and pretty much
anything else.
The language Forth is not known for being easy to read amongst non-
experts, but one thing it gets right is insisting that there are
spaces between tokens.
That is misleading. Forth REQUIRES those spaces otherwise it wouldn't
work; each line would be read as one giant token. It's not because it
wants to instill good habits.
If you wrote C in the same style, it would look pretty weird:
f ( ) ;
f ( x , y , z ) ;
++ p . x ;
q = ( void ) p ;
label :
f ( x , y , z ) ;
a [ i + 1 ] ;
I know this won't cut any ice, as nothing I can possibly write will ever
make the slightest bit of difference, but in C you write stuff like this:
On 07/12/2024 17:52, Bart wrote:
f ( ) ;
f ( x , y , z ) ;
++ p . x ;
q = ( void ) p ;
label :
f ( x , y , z ) ;
a [ i + 1 ] ;
That is just differently bad from "f(x,y,z);" and "q=(void*)p;". Too
much space is bad spacing,
I know this won't cut any ice, as nothing I can possibly write will
ever make the slightest bit of difference, but in C you write stuff
like this:
No, in C /I/ don't write anything of the sort - nor do most C
programmers. And I don't believe you write functions with 8 parameters
in your language either. At the very least, these would be highly unusual.
On 07/12/2024 17:58, David Brown wrote:
On 07/12/2024 17:52, Bart wrote:
f ( ) ;
f ( x , y , z ) ;
++ p . x ;
q = ( void ) p ;
label :
f ( x , y , z ) ;
a [ i + 1 ] ;
That is just differently bad from "f(x,y,z);" and "q=(void*)p;". Too
much space is bad spacing,
You specifically praised Forth for requiring spaces between tokens,
which is exactly what this code has.
<snip purely subjective opinion on unrealistic code>
On 07/12/2024 17:57, Bart wrote:
On 07/12/2024 16:17, David Brown wrote:
On 06/12/2024 23:26, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
This is to be lauded. Somebody actually questioning whether
build-times should be that long.
Somebody other than me, that is.
There's a big difference, however. I am asking Scott with a view to
helping him get more efficient use out of the good tools he uses.
(Scott's been doing this stuff pretty much since the dawn of time, but
there's still a possibility that there are things he has not thought about.)
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
Bart <bc@freeuk.com> writes:
[...]
Those suggesting that semicolons are unimportant details of syntax in
C are wrong. They can make a significant difference and contribute to
errors:
for (i=0; i<n; ++i);
printf("%d\n", i);
Nobody has suggested that "semicolons are unimportant details of syntax
in C".
In my experience, semicolons can be confusing for programmers who are
new to programming, or at least new to a language. I remember "missing
semicolon" being a common error when I was a beginner, but I don't
remember it being a real issue any time in the last several decades.
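(A minimal sketch of that classic error, with hypothetical variables,
showing why the stray semicolon matters.)

    #include <stdio.h>

    int main(void)
    {
        int i, n = 5;

        for (i = 0; i < n; ++i);    /* the ';' here is the (empty) loop body */
            printf("%d\n", i);      /* runs once, after the loop: prints 5   */

        for (i = 0; i < n; ++i)     /* without the stray ';' ...             */
            printf("%d\n", i);      /* ... this prints 0 through 4           */

        return 0;
    }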
On 07/12/2024 21:00, David Brown wrote:
<snip purely subjective opinion on unrealistic code>
You mean that /real/ example where a function needed a type written 6 times instead of once? OK.
Bart <bc@freeuk.com> writes:
On 07/12/2024 14:36, Janis Papanagnou wrote:
On 07.12.2024 14:04, Bart wrote:
Just fine, I'd say.
But it relies on some subtlety.
You seem to see ghosts. There's no subtlety; all is clearly defined,
and it's a sensible feature, and consistently implemented.
'while (cond)' both starts a statement, and can
end a statement:
do while(cond) do ; while (cond);
What is this (IMO syntactically wrong "C" code) supposed to do or to
explain?
See my follow-up.
Your (wrong) second 'do' was indeed confusing! - Why did you provide
a wrong sample to confirm your wrong ideas? - Or is it just your well
known habit to try to confuse other folks as well my making up stupid
things?
You're being aggressive. It was a mistake, that's all. My original
example was meant to show how it gets confusing, but when I
transcribed it into an actual program, it seemed to work because I'd
left out that 'do'.
OK, it was a mistake. Someone pointed it out. And you took offense at
that for some reason.
David Brown <david.brown@hesbynett.no> writes:
On 07/12/2024 17:57, Bart wrote:
On 07/12/2024 16:17, David Brown wrote:
On 06/12/2024 23:26, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
This is to be lauded. Somebody actually questioning something whether
build-times should be that long.
Somebody other than me, that is.
There's a big difference, however. I am asking Scott with a view to
helping him get more efficient use out of the good tools he uses.
(Scott's been doing this stuff pretty much since the dawn of time, but
there's still a possibility that there are things he has not thought about.)
The problem is not the tools. We know exactly what causes the
slowdown (and it is related to code autogenerated from yaml
data structure descriptions). The algorithms used to initialize
internal data structures seriously stress the GCC optimizer. To
the point that there is a 20:1 difference in compile time
for a handful of source files (up to 8 minutes in one case)
between -O0 and -O3.
Time permitting, the plan is to replace the initial ad-hoc
design with something both scalable and performant.
On 07/12/2024 21:00, David Brown wrote:
<snip purely subjective opinion on unrealistic code>
You mean that /real/ example where a function needed a type written 6
times instead of once? OK.
I'd be curious about how you'd write such a function; maybe all your functions only have one parameter, or they are always of mixed types.
It is suitable for a vast number of programs. I'm sure it could be used
for multi-developer projects too. After all, don't your projects also
place all project-structure info into /one/ makefile? Which developer
has editing rights on that file?
Who decides what the modules are going to be? If one developer decided
they need a new module, what is the procedure to get that added to the makefile?
The design IS better in 100 ways, at least compared to C. Other modern, higher level languages have 100 things I don't have, don't understand,
or would find impossible to use.
However a language at the level of C still appears to be extremely
popular. I guess people like something that is not so intimidating and
that they can feel in command of.
No, it is not. You set PATH to the directories you want to use for
binaries - the OS does not search the entire disk or random additional
attached filesystems.
PATH is a collection of paths that are searched in turn. There it will
stop at the first match.
That's similar to my list of modules that are searched in turn. There it
will find all matches and report an ambiguity if more than one.
However, my list is private to my project, while PATH is common to every command prompt and to whatever directory is current. Some programs could
even manipulate the contents of PATH.
No, your design is to /force/ "using namespace" - whether the programmer
wants it or not, and without the programmer even identifying the modules
to import. It's your choice - but it's a bad choice.
The programmer knows how the language works. They will have chosen the modules that are grouped together.
The alternative is to have 50 modules of a project, where each module contains its own subset of 0 to 49 imports. That's up to 950 possible
imports to maintain across up to 50 files. It sounds a maintenance
nightmare. A dependency graph would need a pretty big sheet of paper!
Would it even allow mutual imports? If not then that's another
nightmare, which is likely to generate lots of extra dummy modules which
only exist to get around the restriction.
I'm sorry, but I tried such a scheme, and decided it was awful.
Bart <bc@freeuk.com> wrote:
It is suitable for a vast number of programs. I'm sure it could be used
for multi-developer projects too. After all, don't your projects also
place all project-structure info into /one/ makefile? Which developer
has editing rights on that file?
There are _many_ possible organizations. Rather typical is one
Makefile _per directory_. Most 'make'-s seem to be able to do
file inclusion, so you may have a single main Makefile that
includes whatever pieces are necessary. For example, if somebody
wanted to add support for your language to gcc, that would require
creating a subdirectory and putting a file called 'Make-lang.in'
inside. Make-lang.in is essentially a Makefile for your language,
but it is processed to substitute crucial variables and included
as part of the gcc Makefile. Of course, you will need more, but
this covers the Makefile part.
Who decides what the modules are going to be? If one developer decided
they need a new module, what is the procedure to get that added to the
makefile?
In an open source project the normal thing is a patch or a Git pull
request. This specifies _all_ needed changes, including Makefiles:
if there is a need for a new module, it gets added where needed.
In one of "my" projects there can be multiple modules in a file, so
a new module may go into an existing file (if it is closely related
to other modules in that file) or into a separate file. That also needs
changes to the Makefile.
The above is the technical part. The other is governance of a project.
If there are multiple developers, then there may be some form
of review. Or a developer may be declared maintainer of a
given part and allowed to make changes without review. 'gcc' has several
"global maintainers"; they can change each part and approve
all changes. There are maintainers for specific parts: they
can modify parts they are responsible for and approve changes
there; for other parts they need approval. There are "write
after approval" people who propose changes but are allowed to
merge them only after approval from folks in the first two
groups. There is also a steering body which votes and
makes decisions. Some projects use some form of democracy.
For example, all developers may be considered equal, but
any change needs approval from another developer. In case of
objections there is a vote. Some projects have a "dictator"
or "lead developer" who has the final say on anything project-related.
I have no experience in commercial setups, but IIUC firms
normally have rules and responsible persons. The person responsible
for a project divides the work and possibly delegates responsibility.
Firm or project rules may require review, so that another developer
looks at changes. There could be a design meeting before
coding.
[...]
However a language at the level of C still appears to be extremely
popular. I guess people like something that is not so intimidating and
that they can feel in command of.
Hmm, the only language at C's level that I know of and that is popular
is C. Ada, COBOL, Fortran, Modula 2, Pascal, PL/I are languages
that I consider to be at a level similar to C. But AFAIK none of them
is really popular, and in modern times they aim to be higher-level
than C. Forth is at a lower level and really not that popular.
[snip rest]
On 07/12/2024 14:36, Janis Papanagnou wrote:
On 07.12.2024 14:04, Bart wrote:
Just fine, I'd say.
But it relies on some subtlety.
You seem to see ghosts. There's no subtlety; all is clearly defined,
and it's a sensible feature, and consistently implemented.
'while (cond)' both starts a statement, and can
end a statement:
do while(cond) do ; while (cond);
What is this (IMO syntactically wrong "C" code) supposed to do or to
explain?
See my follow-up.
Your (wrong) second 'do' was indeed confusing! - Why did you provide
a wrong sample to confirm your wrong ideas? - Or is it just your well
known habit to try to confuse other folks as well by making up stupid
things?
You're being aggressive.
It was a mistake, that's all. My original
example was meant to show how it gets confusing, but when I transcribed
it into an actual program, it seemed to work because I'd left out that
'do'.
It says something however when I actually believed that that code was
valid, because the compiler appeared to say so.
Most here (and me too) already acknowledged that "C" is not obvious
to you.
Why is it not possible for you to acknowledge that some language design
patterns may not be as obvious as others?
According to you, even if some construct can be determined to be
unambiguous in some convoluted grammar, then it must also be 100%
obvious to any human reader?
Is it just to avoid admitting that I might have a point?
You don't think there is an element of ambiguity here?
There isn't any.
So you're a parser and you see this:
do ... while
How do you know whether that 'while' starts a new nested loop or
terminates this one?
What does it depend on; what property of blocks in the language is
needed to make it work? What property of statement separators or
terminators is needed.
In C, it apparently relies on blocks (that is, the statements in a loop
body) being only a single statement. So the pattern is this:
do {...} while ...
do stmt; while ...
do ; while ...
But not however these:
do {...}; while ...
do while ... # this can't be the terminating while.
So it can't work in a syntax which allows N statements in a block:
do s1; s2; s3; while ...
Since it can't tell whether that while is the terminator, or is another nested loop.
Now, you could have given a more measured, respectful reply and pointed
these things out, instead of being condescending and patronising.
You might also have pointed out that C could have deprecated null
statements consisting of a single ";" and required the more visible
"{}", as some compilers can be requested to do, since such a ";" can
introduce very subtle errors that are hard to spot.
That the option exists suggests that some people do have trouble with
it.
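For anyone who hasn't been bitten by it, here is a minimal, self-contained sketch of the kind of subtle error such a stray ";" introduces (the variable names are made up):

    #include <stdio.h>

    int main(void)
    {
        int i = 0;

        while (i++ < 3);                    /* stray ';': the loop body is empty */
        {
            printf("ran once, i = %d\n", i); /* NOT the loop body: this block
                                                runs exactly once, afterwards   */
        }
        return 0;
    }

Compiled and run, it prints "ran once, i = 4" a single time; requiring the more visible "{}" for intentionally empty bodies makes the mistake stand out.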
But your attitude appears to be the arrogant one that because it is technically unambiguous, then ANYONE should be able to spot such errors.
And if they can't then they should spend more time studying manuals,
choose a different language, or give up coding altogether.
In your book, it's fine to write expressions like this:
a + b & c * d == f < g | h ? i ^ j : k && l
without parentheses, because the C grammar is 100% unambiguous in what
it means.
That is really not helpful.
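For what it's worth, here is how C's precedence rules actually group that expression. A small self-contained check (the variable values are arbitrary) shows that the terse form and the fully parenthesised form are the same expression:

    #include <stdio.h>

    int main(void)
    {
        int a = 1, b = 2, c = 3, d = 4, f = 5, g = 6, h = 0,
            i = 7, j = 1, k = 1, l = 0;

        int terse = a + b & c * d == f < g | h ? i ^ j : k && l;
        /* What the grammar makes of it, spelled out: */
        int spelt = (((a + b) & ((c * d) == (f < g))) | h) ? (i ^ j) : (k && l);

        printf("%d %d\n", terse, spelt);   /* always prints the same value twice */
        return 0;
    }

Whether anyone should write the first form is, of course, exactly the point under dispute.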
Bart <bc@freeuk.com> wrote:
It is suitable for a vast number of programs. I'm sure it could be used
for multi-developer projects too. After all, don't your projects also
place all project-structure info into /one/ makefile? Which developer
has editing rights on that file?
There are _many_ possible organizations. Rather typical is one
Makefile _per directory_. Most 'make'-s seem to be able to do
file inclusion, so you may have a single main Makefile that
includes whatever pieces are necessary. For example, if somebody
wanted to add support for your language to gcc, that would need
creating a subdirectory and putting file called 'Make-lang.in'
inside. Make-lang.in is essentially Makefile for your language,
but it is processed to substitute crucial variable and included
as part of gcc Makefile. Of course, you will need more, but
this covers Makefile part.
Who decides what the modules are going to be? If one developer decided
they need a new module, what is the procedure to get that added to the
makefile?
In an open source project the normal thing is a patch or a Git pull
request. This specifies _all_ needed changes, including Makefiles:
if there is a need for a new module, it gets added where needed.
In one of "my" projects there can be multiple modules in a file, so
a new module may go into an existing file (if it is closely related
to other modules in that file) or into a separate file. That also needs
changes to the Makefile.
The above is the technical part. The other is governance of a project.
If there are multiple developers, then there may be some form
of review.
The design IS better in 100 ways, at least compared to C. Other modern,
higher level languages have 100 things I don't have, don't understand,
or would find impossible to use.
However a language at the level of C still appears to be extremely
popular. I guess people like something that is not so intimidating and
that they can feel in command of.
Hmm, the only language at C's level that I know of and that is popular
is C. Ada, COBOL, Fortran, Modula 2, Pascal, PL/I are languages
that I consider to be at a level similar to C. But AFAIK none of them
is really popular, and in modern times they aim to be higher-level
than C. Forth is at a lower level and really not that popular.
Can you name a language that is at the level of C, is popular, and
is not C?
The alternative is to have 50 modules of a project, where each module
contains its own subset of 0 to 49 imports. That's up to 950 possible
imports to maintain across up to 50 files. It sounds a maintenance
nightmare. A dependency graph would need a pretty big sheet of paper!
You discard the possibility of re-export. A lazy programmer may write a
"utility" module that imports every needed interface and re-exports
it, and then import this single module in each implementation.
And yes, import statements need maintenance; IME it was not a
problem, as import statements were a small part of the source. Normally
I would import a module if I needed to use something from it at least
twice, and usually the number of uses was higher.
And module imports tended to change more slowly than normal code.
Would it even allow mutual imports? If not then that's another
nightmare, which is likely to generate lots of extra dummy modules which
only exist to get around the restriction.
I'm sorry, but I tried such a scheme, and decided it was awful.
One can do better than Turbo Pascal or Modula 2, but IME both
had quite reasonable module systems. For example, instead of a
separate interface text one could just mark exported things.
But explicit imports, more precisely the programmer's ability to
restrict imports to only those which are explicitly specified,
is essential.
Turbo Pascal, and I think also Modula 2, distinguish imports in the
interface from imports in the implementation. Imports in the interface
part must form a tree (no cycles allowed). Imports in the
implementation part allow arbitrary dependencies. One could
allow weaker restrictions; for example, during compilation you
may allow partially defined things and only require that there
are no cycles when you try to fill in the needed information.
For example, module A could define a type depending on a constant
from module B, and module B could define a type depending on
module A.
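Translated into plain C terms (a sketch with invented names, not anyone's actual module system), that kind of mutual dependency is usually resolved the same way: one side needs only an incomplete declaration of the other side's type.

    #include <stdio.h>

    /* "module B", part 1: a constant that module A needs */
    enum { B_COUNT = 4 };

    /* "module A": a type built from B's constant */
    struct a_item {
        int values[B_COUNT];
    };

    /* "module B", part 2: a type that refers back to module A;
       a pointer member only needs the type to be declared, not defined */
    struct b_registry {
        struct a_item *items;
        int            n;
    };

    int main(void)
    {
        struct a_item items[2] = { { {1, 2, 3, 4} }, { {5, 6, 7, 8} } };
        struct b_registry reg = { items, 2 };
        printf("%d\n", reg.items[1].values[3]);   /* prints 8 */
        return 0;
    }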
On 06/12/2024 20:20, Bart wrote:
If your task to get from A to B was split into two, you'd be happy to
do the first part by a fast car, then complete the rest of it on a
horse and cart, for no reason at all?
The comparison was between C to object code (with a real compiler) and
from X to C and then to the object code (using a real compiler). If
your beliefs were true that gcc (and other proper C compilers) are
incredibly slow, why would it make any difference if someone is starting
from X or starting from C? In both cases, compilation would take a long time - C compilation speed is neither more nor less important whether
you are programming in X or C.
And you are the only one so far who finds gcc to be inconveniently slow.
On 07/12/2024 16:06, David Brown wrote:
On 06/12/2024 20:20, Bart wrote:
If your task to get from A to B was split into two, you'd be happy to
do the first part by a fast car, then complete the rest of it on a
horse and cart, for no reason at all?
The comparison was between C to object code (with a real compiler) and
from X to C and then to the object code (using a real compiler). If
your beliefs were true that gcc (and other proper C compilers) are
incredibly slow, why would it make any difference if someone is
starting from X or starting from C? In both cases, compilation would
take a long time - C compilation speed is neither more nor less
important whether you are programming in X or C.
You don't appear to get it.
If you are writing C by hand, then people
like you would want to use a more powerful, and therefore slower,
compiler, that will analyse that C. It can also take care of the many shortcomings in the language.
But if the C has been machine-generated, that analysis is no longer
relevant. Then you may just want the fastest and simplest conversion.
This was the basis of my use-case for a fast and possibly compact compiler.
And you are the only one so far who finds gcc to be inconveniently slow.
On 10/12/2024 14:56, bart wrote:
On 07/12/2024 16:06, David Brown wrote:
On 06/12/2024 20:20, Bart wrote:
If your task to get from A to B was split into two, you'd be happy
to do the first part by a fast car, then complete the rest of it on
a horse and cart, for no reason at all?
The comparison was between C to object code (with a real compiler)
and from X to C and then to the object code (using a real compiler).
If your beliefs were true that gcc (and other proper C compilers) are
incredibly slow, why would it make any difference if someone is
starting from X or starting from C? In both cases, compilation would
take a long time - C compilation speed is neither more nor less
important whether you are programming in X or C.
You don't appear to get it.
No, /you/ don't get it. I did not say that people using language X
don't care about the speed of C compilation. I said it doesn't matter
any more or any less than for people writing C.
If you are writing C by hand, then people like you would want to use a
more powerful, and therefore slower, compiler, that will analyse that
C. It can also take care of the many shortcomings in the language.
But if the C has been machine-generated, that analysis is no longer
relevant. Then you may just want the fastest and simplest conversion.
As I said before, the analysis is needed for good optimisation -
generating static error checking takes very little extra time when you
are optimising. And often people using C as an intermediary language
want optimisation of the C code.
On 10/12/2024 16:16, David Brown wrote:
On 10/12/2024 14:56, bart wrote:
On 07/12/2024 16:06, David Brown wrote:
On 06/12/2024 20:20, Bart wrote:
If your task to get from A to B was split into two, you'd be happy
to do the first part by a fast car, then complete the rest of it on
a horse and cart, for no reason at all?
The comparison was between C to object code (with a real compiler)
and from X to C and then to the object code (using a real compiler).
If your beliefs were true that gcc (and other proper C compilers)
are incredibly slow, why would it make any difference if someone is
starting from X or starting from C? In both cases, compilation
would take a long time - C compilation speed is neither more nor
less important whether you are programming in X or C.
You don't appear to get it.
No, /you/ don't get it. I did not say that people using language X
don't care about the speed of C compilation. I said it doesn't matter
any more or any less than for people writing C.
If you are writing C by hand, then people like you would want to use
a more powerful, and therefore slower, compiler, that will analyse
that C. It can also take care of the many shortcomings in the language.
But if the C has been machine-generated, that analysis is no longer
relevant. Then you may just want the fastest and simplest conversion.
As I said before, the analysis is needed for good optimisation -
And as I said before, optimisation is very often not needed.
generating static error checking takes very little extra time when you
are optimising. And often people using C as an intermediary language
want optimisation of the C code.
That's a benefit of it yes, but there's also a cost.
It depends a lot on how everything works. But if I take an old project
of mine that uses a C transpiler, and apply it to compiler projects that
I won't bore you with, I get these timings:
C compiler: Overall (C compilation) Runtime
cc 0.17 0.08 1.29 secs
tcc 0.27 0.18 1.56
gcc-O0 2.63 2.54 1.34
gcc-O2 12.6 11.7 0.8
gcc-O3 16.8 15.9 0.8
The intermediate C is one file. Transpiling to that C takes 0.09 seconds
in all cases. 'cc' is my newest C compiler (here beating tcc on 'busy' generated code, but that is not typical).
The runtime is that of running on my largest test input, an extreme
test. Most inputs are a fraction of the size but finish too quickly to measure easily.
For comparison, the timings for my normal compiler, going straight to
EXE and not via C, are:
mm 0.9 ---- 1.03
Using tcc-compiled intermediate C would still give a reasonable
development experience. Using even gcc-O0 100s of times a day would be
very annoying, for zero benefit.
On 12/4/24 15:55, David Brown wrote:
On 04/12/2024 16:09, Bart wrote:...
My remarks have been about the namespace that is created by a module.
I understand that C++ namespaces can be created in other ways, like
classes.
In reference to C++, I have been using the term "namespaces" the way C++
specifically defines the term "namespace". Perhaps I should have given
a reference to that in some way - this is comp.lang.c, after all.
What you are now talking about is what C calls "name spaces" - with a
space. C++ does not use that term in its standards (at least, not in
the C++20 standards I have open here) - it does not have a clear name
for the concept, as far as I have found. (I am not nearly as familiar
with the C++ standards as I am with the C standards, so if this matters
then perhaps someone else can chime in.)
The latest version of the C++ standard that I have is n4860.pdf, dated
2020-03-31. It has two occurrences of the phrase "name space" outside of
the section that describes differences from C:
8.2p1: "Labels have their own name space ..."
15.6p8: "There is one name space for macro names."
All other C++ identifiers reside in the ordinary name space.
Those three name spaces all exist in C. In addition, each C struct or
union has a separate name space; C++ does not have the same rule, but
achieves a similar effect by giving those names a distinct scope.
In C, struct, union, and enumeration tags have their own name space; in C++
they reside in the ordinary name space.
Half of all the differences between C and C++ that result in textually
identical code having different well-defined behavior in each language
are tied to differences in how name spaces and scopes work. Those
differences are deeply fundamental, and not easily removed.
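A small, self-contained C example of the separate tag name space being discussed (the identifiers are invented for illustration). C accepts this because the tag 'node' and the object 'node' are looked up in different name spaces; C++ reaches a similar result through its name-hiding rules rather than a separate name space, as noted above.

    #include <stdio.h>

    struct node { int value; };      /* 'node' here is a tag */

    int main(void)
    {
        int node = 42;               /* fine in C: ordinary identifiers and
                                        tags live in different name spaces  */
        struct node n = { node };    /* the tag is reached via 'struct'     */
        printf("%d\n", n.value);     /* prints 42                           */
        return 0;
    }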
On 04/12/2024 16:09, Bart wrote:...
My remarks have been about the namespace that is created by a module.
I understand that C++ namespaces can be created in other ways, like
classes.
In reference to C++, I have been using the term "namespaces" the way C++ specifically defines the term "namespace". Perhaps I should have given
a reference to that in some way - this is comp.lang.c, after all.
What you are now talking about is what C calls "name spaces" - with a
space. C++ does not use that term in its standards (at least, not in
the C++20 standards I have open here) - it does not have a clear name
for the concept, as far as I have found. (I am not nearly as familiar
with the C++ standards as I am with the C standards, so if this matters
then perhaps someone else can chime in.)
On 07.12.2024 16:33, Bart wrote:
You're being aggressive.
Am I?
It says something however when I actually believed that that code was
valid, because the compiler appeared to say so.
Huh?
An unambiguous grammar is something quite essential; how would you
parse code if it were ambiguous?
You postulate it as if the grammar were convoluted;
So you're a parser and you see this:
do ... while
How do you know whether that 'while' starts a new nested loop or
terminates this one?
Because there's the '...' in between that answers that question.
So you've got it?
On 30/11/2024 18:38, Bart wrote:
It will at least work with more compiles.
And why would that matter? No actual developer would care if their code
can be compiled by your little toy compiler, or even more complete
little tools like tcc. Code needs to work on the compilers that are
suitable for the job - compatibility with anything else would just be a
waste of effort and missing out on useful features that makes the code better.
An unambiguous grammar is something quite essential; how would you
parse code if it were ambiguous?
David Brown <david.brown@hesbynett.no> wrote:
On 30/11/2024 18:38, Bart wrote:
It will at least work with more compiles.
And why would that matter? No actual developer would care if their
code can be compiled by your little toy compiler, or even more
complete little tools like tcc. Code needs to work on the
compilers that are suitable for the job - compatibility with
anything else would just be a waste of effort and missing out on
useful features that makes the code better.
You are exaggerating and that does not help communication. In this
group there was at least one serious poster claiming to write code
depending only on features from an older C standard.
People like this
presumably would care if some "toy" compiler discovered non-compliance.
Concerning tcc, it has an explicit endorsement from the gawk developer:
he likes the compile speed and says that gawk compiles fine using tcc.
In my coding I use gcc extensions when I feel that there is
substantial gain. But for a significant part of my code I prefer
portability, and that may include avoiding features not
supported by lesser compilers. In the past tcc was not able
to compile code which I consider rather ordinary C, and due
to this and the lack of support for my main target I did not use
tcc. But tcc has improved; ATM I do not know if it is good enough
for me, but it passed initial tests, so I have no reason to
disregard it.
BTW: IME "exotic" tools and targets help with finding bugs.
So even if you do not normally need to compile with some
compiler it makes sense to check if it works.
On Wed, 11 Dec 2024 05:37:15 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
You are exaggerating and that does not help communication. In this
group there was at least one serious poster claiming to write code
depending only on features from an older C standard.
For some definition of "serious".
People like this
presumably would care if some "toy" compiler discovered non-compliance.
Concerning tcc, it has an explicit endorsement from the gawk developer:
he likes the compile speed and says that gawk compiles fine using tcc.
In my coding I use gcc extensions when I feel that there is
substantial gain. But for a significant part of my code I prefer
portability, and that may include avoiding features not
supported by lesser compilers. In the past tcc was not able
to compile code which I consider rather ordinary C, and due
to this and the lack of support for my main target I did not use
tcc. But tcc has improved; ATM I do not know if it is good enough
for me, but it passed initial tests, so I have no reason to
disregard it.
BTW: IME "exotic" tools and targets help with finding bugs.
So even if you do not normally need to compile with some
compiler it makes sense to check if it works.
I would think that the main reason for David Brown's absence of
interest in tcc is simply that tcc does not have back ends for the
targets that he cares about.
In particular, the Arm port appears to have been abandoned in 2003, so quite
likely tcc can't generate code that runs on MCUs with the ARMv7-M
architecture, which happened to be first released in the same year and officially named in 2004.
On 07.12.2024 16:33, Bart wrote:
An unambiguous grammar is something quite essential; how would you
parse code if it were ambiguous?
On 09/12/2024 18:46, Janis Papanagnou wrote:
On 07.12.2024 16:33, Bart wrote:
So you're a parser and you see this:
do ... while
How do you know whether that 'while' starts a new nested loop or
terminates this one?
Because there's the '...' in between that answers that question.
Oh, I see. That answers everything! Including having a '...' that
includes statements starting with 'while'.
On 11/12/2024 09:18, Michael S wrote:
On Wed, 11 Dec 2024 05:37:15 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
You are exaggerating and that does not help communication. In this
group there was at least one serious poster claiming to write code
depending only on features from an older C standard.
For some definition of "serious".
People like this
presumably would care if some "toy" compiler discovered non-compliance.
Concerning tcc, it has an explicit endorsement from the gawk developer:
he likes the compile speed and says that gawk compiles fine using tcc.
In my coding I use gcc extensions when I feel that there is
substantial gain. But for a significant part of my code I prefer
portability, and that may include avoiding features not
supported by lesser compilers. In the past tcc was not able
to compile code which I consider rather ordinary C, and due
to this and the lack of support for my main target I did not use
tcc. But tcc has improved; ATM I do not know if it is good enough
for me, but it passed initial tests, so I have no reason to
disregard it.
BTW: IME "exotic" tools and targets help with finding bugs.
So even if you do not normally need to compile with some
compiler it makes sense to check if it works.
I would think that the main reason for David Brown's absence of
interest in tcc is simply that tcc does not have back ends for the
targets that he cares about.
In particular, the Arm port appears to have been abandoned in 2003, so quite
likely tcc can't generate code that runs on MCUs with the ARMv7-M
architecture, which happened to be first released in the same year and officially named in 2004.
I remember running TCC on both RPi1 (2012) and RPi4 (2019). That
would be ARM32 (some version of ARMv7 I guess; I find ARM model
numbers bewildering).
It's possible I also tried TCC in the ARM64 mode of RPi4.
So it sounds rather unlikely that TCC doesn't support ARM.
On 2024-12-09, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
An unambiguous grammar is something quite essential; how would you
parse code if it were ambiguous?
Here's an ambiguity in the C grammar:
<statement>:
...
<selection-statement>
...
<selection-statement>:
...
if ( <expression> ) <statement>
if ( <expression> ) <statement> else <statement>
...
The following selection-statement is grammatically ambiguous:
if (E1) if (E2) S1 else S2
it has two possible parsings:
if (E1) <statement> else S2
where <statement> expands to
if (E2) S1
or
if (E1) <statement>
where <statement> expands to
if (E2) S1 else S2
The grammatical ambiguity is resolved by an additional rule in the 'Semantics' section for selection-statement:
3 An else is associated with the lexically nearest preceding if that is
allowed by the syntax.
gcc -Wall will issue a warning for such an ambiguous statement:
warning: suggest explicit braces to avoid ambiguous 'else' [-Wdangling-else]
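A minimal sketch of that rule in practice (the function and values are invented): in the first form the 'else' binds to the nearest 'if', which is what -Wdangling-else is guarding against, and the braced form makes the grouping explicit.

    #include <stdio.h>

    static void classify(int a, int b)
    {
        if (a > 0)
            if (b > 0)
                printf("both positive\n");
            else
                printf("a positive, b not\n");   /* binds to 'if (b > 0)' */

        if (a > 0) {                             /* same meaning, spelled out */
            if (b > 0) {
                printf("both positive\n");
            } else {
                printf("a positive, b not\n");
            }
        }
    }

    int main(void)
    {
        classify(1, -1);   /* prints "a positive, b not" twice */
        return 0;
    }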
bart <bc@freeuk.com> wrote:
On 09/12/2024 18:46, Janis Papanagnou wrote:
On 07.12.2024 16:33, Bart wrote:
So you're a parser and you see this:
do ... while
How do you know whether that 'while' starts a new nested loop or
terminates this one?
Because there's the '...' in between that answers that question.
Oh, I see. That answers everything! Including having a '...' that
includes statements starting with 'while'.
The C grammar is not LL(1), that is, you cannot recognize a construct
by looking at its first token. You need the whole construct and
sometimes also the following token. And there are also well-known
troubles due to type names and the "dangling else".
So C is almost, but not entirely, LR(1). But the description, that
is, grammar + extra rules, is unambiguous, not very complex, and
it is well known how to parse C with good efficiency.
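The type-name trouble mentioned there can be shown in a few lines (a sketch; the names are arbitrary): the same token sequence "T * x;" is a declaration or an expression depending on what 'T' currently names, which is exactly why the parser needs feedback from a symbol table.

    typedef int T;

    void declares_a_pointer(void)
    {
        T * x;               /* declaration: x is a pointer to int */
        (void)x;
    }

    void multiplies_two_ints(void)
    {
        int T = 6, x = 7;    /* the local int T hides the typedef here */
        (void)(T * x);       /* so this is a multiplication, not a declaration */
    }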
On 11/12/2024 09:43, Ike Naar wrote:
On 2024-12-09, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
An unambiguous grammar is something quite essential; [...]
[ dangling-else sample ]
Given that the resolution is in the "semantics" section rather than the "syntax" section, it might seem like a grammatical ambiguity. But I
don't think it is, technically - the syntax rules say that the set of
tokens making up "if (E1) if (E2) S1 else S2" are valid syntax. It is
up to the semantics to determine what the code will do here. (And the semantics are unambiguous.)
On 09/12/2024 18:46, Janis Papanagnou wrote:
An unambiguous grammar is something quite essential; how would you
parse code if it were ambiguous?
You can easily parse a language in an ambiguous syntax. [...]
You postulate it as if the grammar were convoluted;
My opinion is that it is. Especially its propensity for using
long-winded production terms that look confusingly similar. (Would it
have killed somebody to make the names more distinct?)
So you're a parser and you see this:
do ... while
How do you know whether that 'while' starts a new nested loop or
terminates this one?
Because there's the '...' in between that answers that question.
Oh, I see. That answers everything! Including having a '...' that
includes statements starting with 'while'.
[...]
My opinion about this feature remains the same, that it was a poor
design choice. One of dozens of such choices. An unambiguous grammar
does not by itself make for a good syntax.
You however seem utterly incapable of criticising C.
Could you write an
article all about its quirks and gotchas and flaws? I doubt it.
[...]
On 2024-12-09, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
An unambiguous grammar is something quite essential; how would you
parse code if it were ambiguous?
Here's an ambiguity in the C grammar:
[...]
The following selection-statement is grammatically ambiguous:
if (E1) if (E2) S1 else S2
On 11.12.2024 09:43, Ike Naar wrote:
On 2024-12-09, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
An unambiguous grammar is something quite essential; how would you
parse code if it were ambiguous?
Here's an ambiguity in the C grammar:
[...]
The following selection-statement is grammatically ambiguous:
if (E1) if (E2) S1 else S2
Yes, the dangling else is a common ambiguity in many programming
languages.
That's why I prefer languages with syntaxes like in Algol 68 or
Eiffel (for example).
On 11/12/2024 17:30, Janis Papanagnou wrote:
On 11.12.2024 09:43, Ike Naar wrote:
On 2024-12-09, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
An unambiguous grammar is something quite essential; how would you
parse code if it were ambiguous?
Here's an ambiguity in the C grammar:
[...]
The following selection-statement is grammatically ambiguous:
if (E1) if (E2) S1 else S2
Yes, the dangling else is a common ambiguity in many programming
languages.
That's why I prefer languages with syntaxes like in Algol 68 or
Eiffel (for example).
It is easy to avoid in a C-like language - simply require braces on "if" statements, or at the very least, require them when there is an "else" clause.
Most C coding standards and style guides make that requirement
- not because the C compiler sees it as ambiguous, but because humans
often do. (Or they misinterpret it.)
Bart <bc@freeuk.com> writes:
On 07/12/2024 21:00, David Brown wrote:
<snip purely subjective opinion on unrealistic code>
You mean that /real/ example where a function needed a type written 6 times instead of once? OK.
I've always wondered why prototypes in C did not simply use the existing syntax for declarations. After all, it was right there in K&R C, just outside the parentheses:
f(m, n, s)
int m, n;
char *s;
{ ... }
could have become
f(int m, n; char *s)
{ ... }
rather than
f(int m, int n, char *s)
{ ... }
Does anyone know if there even /was/ a reason?
Suffice it to say that as far as code generation is concerned, the
Cortex-A devices on the RPi's are completely different from the
Cortex-M devices in microcontrollers. It's like comparing the 80286
with the Z80A.
On 11.12.2024 17:47, David Brown wrote:
On 11/12/2024 17:30, Janis Papanagnou wrote:
On 11.12.2024 09:43, Ike Naar wrote:
On 2024-12-09, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
An unambiguous grammar is something quite essential; how would you
parse code if it were ambiguous?
Here's an ambiguity in the C grammar:
[...]
The following selection-statement is grammatically ambiguous:
if (E1) if (E2) S1 else S2
Yes, the dangling else is a common ambiguity in many programming
languages.
That's why I prefer languages with syntaxes like in Algol 68 or
Eiffel (for example).
It is easy to avoid in a C-like language - simply require braces on "if"
statements, or at the very least, require them when there is an "else"
clause.
Yes, sure. But, I can't help, it smells like a workaround.
Most C coding standards and style guides make that requirement
- not because the C compiler sees it as ambiguous, but because humans
often do. (Or they misinterpret it.)
Yes, true. (We had that in our standards, too.)
On Wed, 11 Dec 2024 16:15:25 +0100
David Brown <david.brown@hesbynett.no> wrote:
Suffice it to say that as far as code generation is concerned, the
Cortex-A devices on the RPi's are completely different from the
Cortex-M devices in microcontrollers. It's like comparing the 80286
with the Z80A.
With the exception of RPi1, saying so would be an exaggeration.
RPi2 through 5 are technically capable of running Thumb2-encoded user-level routines.
Maybe mutual incompatibility with MCUs will become true again in the
next generation of RPi. But more likely it will not happen until RPi7.
On 11/12/2024 16:51, Janis Papanagnou wrote:
On 11.12.2024 17:47, David Brown wrote:
On 11/12/2024 17:30, Janis Papanagnou wrote:
On 11.12.2024 09:43, Ike Naar wrote:
On 2024-12-09, Janis Papanagnou <janis_papanagnou+ng@hotmail.com>
wrote:
An unambiguous grammar is something quite essential; how would you parse code if it were ambiguous?
Here's an ambiguity in the C grammar:
[...]
The following selection-statement is grammatically ambiguous:
if (E1) if (E2) S1 else S2
Yes, the dangling else is a common ambiguity in many programming
languages.
That's why I prefer languages with syntaxes like in Algol 68 or
Eiffel (for example).
It is easy to avoid in a C-like language - simply require braces on "if" statements, or at the very least, require them when there is an "else"
clause.
Yes, sure. But, I can't help, it smells like a workaround.
Most C coding standards and style guides make that requirement
- not because the C compiler sees it as ambiguous, but because humans
often do. (Or they misinterpret it.)
Yes, true. (We had that in our standards, too.)
So here you finally acknowledge there may be ambiguity from a human perspective.
But when I try to make that very point, it's me making up unrealistic examples; I'm being deliberately obtuse; I clearly find things confusing
that no one else has a problem with; or you make snarky comments like this:
"I mean, if you get confused by an unambiguous syntaxes already,
what do you think happens with people if they have to program
in or understand an ambiguous language!"
It's astonishing how I have to fight across a dozen posts to back up my
point of view (you even specifically said you didn't recognise my point).
And yet here: somebody makes that very same point, and you merely say:
"Yes, true."
It really is extraordinary.
This also comes up with 'while (cond); {...}'.
On 11/12/2024 18:20, bart wrote:
On 11/12/2024 16:51, Janis Papanagnou wrote:
On 11.12.2024 17:47, David Brown wrote:
Most C coding standards and style guides make that requirement
- not because the C compiler sees it as ambiguous, but because humans
often do. (Or they misinterpret it.)
Yes, true. (We had that in our standards, too.)
So here you finally acknowledge there may be ambiguity from a human
perspective.
I don't think anyone has ever disagreed that you can write "if"
statements with "else" in C in a way that is confusing or unclear to
human readers. (But it still has fixed and unambiguous meaning to the compiler.)
Equally, of course, it is possible to write them in a way that is not confusing or unclear at all.
But there is no ambiguity in the language itself - only possible
confusions from the way the language could be used. (And since the
language is well defined here, it doesn't make sense to call it an
ambiguity at all, even "from a human perspective". Call it unclear, confusing, or deceptive code if you like.)
I don't see that anyone has changed their position on this.
But when I try to make that very point, it's me making up unrealistic
examples; I'm being deliberately obtuse; I clearly find things
confusing that no one else has a problem with; or you make snarky
comments like this:
Sure.
We all know it is possible to write unclear and confusing C code.
There's a whole competition devoted to it!
But we also all know that most C programmers - like programmers of any
other language - usually avoid writing code they find unclear. (Clarity
is quite subjective - a solid proportion of programmers will agree
roughly on a set of guidelines, but there will be outliers with very different positions. This is independent of language.)
"I mean, if you get confused by an unambiguous syntaxes already,
what do you think happens with people if they have to program
in or understand an ambiguous language!"
It's astonishing how I have to fight across a dozen posts to back up
my point of view (you even specifically said you didn't recognise my
point).
And yet here: somebody makes that very same point, and you merely say:
"Yes, true."
It really is extraordinary.
It was a totally different point. It is extraordinary that you can't
see that.
You claimed the language and its grammar was ambiguous and confusing. It
is not.
Why is it not possible for you to acknowledge that some language design
patterns may not be as obvious as others?
According to you, even if some construct can be determined to be
unambiguous in some convoluted grammar, then it must also be 100%
obvious to any human reader?
Who said that? - Again you make up things just for your argument.
An unambiguous grammar is something quite essential; how would you
parse code if it were ambiguous?
To what ([hypothetical] "some") grammar are you referring to?
If you mean the "C" grammar; what, concretely, you find to be
"convoluted"?
You postulate it as if the grammar were convoluted; more likely
it's just your personal problem with understanding formal syntax.
No one said that everything is "100% obvious". An unambiguous
grammar is a precondition for understanding, though.
If you had your excerpts and samples formatted in a common
(non-lunatic) way then it would have been "100% obvious" even
to you.
Is it just to avoid admitting that I might have a point?
(Yet you haven't had one.)
On 07/12/2024 11:34, Janis Papanagnou wrote:
On 06.12.2024 02:20, Bart wrote:
I used to be puzzled by this too: 'while' can both start a while
statement, and it can delimit a do-while statement. How is that possible?
A keyword (like 'while') is just a lexical token that can be used in
different syntactical contexts.
Is it common to both start and end a statement with the same keyword?
(Even with different semantics, if a
language designer thinks this is a good idea for his case.)
You may be less confused with using Pascal;
Now you are demonstrating my point about being treated like a beginner.
And it is exasperating.
This is a point of language design. C's approach works - just. But it
relies on some subtlety. 'while (cond)' both starts a statement, and can
end a statement:
do while(cond) do ; while (cond);
The second 'while' here starts a nested while-loop inside a do-while
loop. Not confusing? It is to me! I couldn't finish my example as I got
lost (but as it happens, this is valid code, partly by luck).
On 11/12/2024 18:22, Michael S wrote:
On Wed, 11 Dec 2024 16:15:25 +0100
David Brown <david.brown@hesbynett.no> wrote:
Suffice it to say that as far as code generation is concerned, the
Cortex-A devices on the RPi's are completely different from the
Cortex-M devices in microcontrollers. It's like comparing the
80286 with the Z80A.
With exception of RPi1, saying so would be an exaggeration.
RPi2 through 5 are technically capable to run Thumb2-encoded user
level routines.
I did not know that the 64-bit Cortex-A devices could run 32-bit Thumb2-encoded instructions. That does reduce the difference
somewhat.
Compilers targeting 64-bit ARM would generate 64-bit (AArch64)
instructions, which are of course significantly different.
But
perhaps a compiler targeting 32-bit Cortex-A devices might be able to generate Thumb2 instructions.
However, you would not expect binary compatibility between code
generated for a 32-bit Cortex-M device and a Cortex-A platform, even
if it supports Thumb2 instructions - you have major differences in
the ABI, memory layouts, and core features beyond the basic registers.
May be, mutual incompatibility with MCUs would become true again in
the next generation of RPi. But more likely it would not happen
until RPi7.
bart <bc@freeuk.com> writes:
[...]
You need input from a symbol table in order to parse C, a table that
the parser needs to maintain as it processes source code. That will
tell you whether a particular identifier is a typename or not.
Yes. (I've mentioned this a number of times.)
There are issues also with keywords like 'break'.
What issues?
If you're referring to the fact that break can apply either to a loop or
to a switch, that's a potential source of confusion, but it shouldn't be
a problem once you're aware of it.
bart <bc@freeuk.com> writes:
Are you bothered by the fact that "break" applies to all three
kinds of iteration statements (while, do, for)?
Sure, the fact that "break" is overloaded can be slightly
inconvenient. So can the fact that "break" only exits the
*innermost* enclosing loop or switch.
Using two different keywords (which, to be clear, is unlikely to
happen in any language called "C") would only solve a small part of
the problem. What I'd really like to see is a mechanism for "break"
(or "continue") to apply to a *named* enclosing construct, something
I've found very useful in other languages that support it. It would
provide an easy solution to the scenario you mentioned above.
Bart <bc@freeuk.com> wrote:
On 07/12/2024 11:34, Janis Papanagnou wrote:
On 06.12.2024 02:20, Bart wrote:
I used to be puzzled by this too: 'while' can both start a while
statement, and it can delimit a do-while statement. How is that possible?
A keyword (like 'while') is just a lexical token that can be used in
different syntactical contexts.
Is it common to both start and end a statement with the same keyword?
Technically, a 'do' loop ends in a semicolon.
Concerning "start and end a statement with the same keyword",
this is clearly common when a statement consists of a single
symbol, like the empty statement in many languages, including C and
Algol 68.
I think it was already noted that this is not a valid C statement.
do while(cond) ; while (cond);
is valid, and indeed may be confusing to humans. But people
would normally write it as
do {
    while(cond) ;
} while (cond);
or even
do {
    while(cond) {
        ;
    }
} while (cond);
to stress that inner loop is "do nothing" loop. In this form
structure is clear.
If you want a human-oriented rule for parsing badly written code,
you may notice that the body of a C loop cannot be syntactically
empty; you need at least a semicolon. So the first 'while'
after 'do' must start an inner loop.
Concerning "subtle", note that changing say '*' to '+' can
signifcantly change parse tree and semantics in almost any
programming language. Most people are not bothered by this
and take care when writing/reading operators. Similarly
C programmers are careful when writing/reading semicolons.
You can design a language with more redundancy. For example,
one of the languages that I use has
if cond then expr1 else expr2 endif
while cond do expr endwhile
repeat expr endrepeat
On 11.12.2024 16:28, David Brown wrote:
On 11/12/2024 09:43, Ike Naar wrote:
On 2024-12-09, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
An unambiguous grammar is something quite essential; [...]
[ dangling-else sample ]
Given that the resolution is in the "semantics" section rather than the
"syntax" section, it might seem like a grammatical ambiguity. But I
don't think it is, technically - the syntax rules say that the set of
tokens making up "if (E1) if (E2) S1 else S2" are valid syntax. It is
up to the semantics to determine what the code will do here. (And the
semantics are unambiguous.)
I'm a bit ambivalent about that. - Yes, technically it's syntax, it's
syntactically probably correct, and it has a semantic ambiguity that
needs to be resolved. All languages with the dangling-else property do
resolve that. But the syntax could have been defined in such a way that
a dangling else cannot appear in the first place. (Not in "C" and
descendant languages, of course; that ship has sailed.)
On 11/12/2024 16:51, Janis Papanagnou wrote:
[ dangling else ]
So here you finally acknowledge there may be ambiguity from a human perspective.
[...]
James Kuyper <jameskuyper@alumni.caltech.edu> writes:...
On 12/4/24 15:55, David Brown wrote:
On 04/12/2024 16:09, Bart wrote:
In C, struct, union, and enumeration tags have their own name space; in C++
they reside in the ordinary name space.
In C++ they reside in the ordinary name space only if they're not
part of a named namespace.
namespace x {
int y;
};
On 12/10/24 20:32, Scott Lurndal wrote:
[...]
You're confusing name spaces and namespaces.
[...]
On Wed, 11 Dec 2024 21:35:06 +0100
David Brown <david.brown@hesbynett.no> wrote:
On 11/12/2024 18:22, Michael S wrote:
On Wed, 11 Dec 2024 16:15:25 +0100
David Brown <david.brown@hesbynett.no> wrote:
Suffice it to say that as far as code generation is concerned, the
Cortex-A devices on the RPi's are completely different from the
Cortex-M devices in microcontrollers. It's like comparing the
80286 with the Z80A.
With exception of RPi1, saying so would be an exaggeration.
RPi2 through 5 are technically capable to run Thumb2-encoded user
level routines.
I did not know that the 64-bit Cortex-A devices could run 32-bit
Thumb2-encoded instructions. That does reduce the difference
somewhat.
Support for T32 in ARMv8-A is optional.
In practice, majority of "two-digit" Cortex-A devices support it. There
are only two exceptions - A34 and A65, both rather obscure.
"Three-digit", i.e. ARMv9 Cortex-A cores are another story. Most of them either do not support T32 (and A32) at all, or make it optional for implemetator.
On the other hand, nearly all Arm Inc. server cores, even those that
are very closely related to Cortex-A cores, like Neoverse N1 (a variant
of Cotex-A76) do *not* support aarch32.
Compilers targeting 64-bit ARM would generate 64-bit (AArch64)
instructions, which are of course significantly different.
Understatement detected.
But
perhaps a compiler targeting 32-bit Cortex-A devices might be able to
generate Thumb2 instructions.
The majority of compilers are. But it seems that tcc is not.
Or maybe it is; the tcc docs are too sparse.
However, you would not expect binary compatibility between code
generated for a 32-bit Cortex-M device and a Cortex-A platform, even
if it supports Thumb2 instructions - you have major differences in
the ABI, memory layouts, and core features beyond the basic registers.
There are major differences in the floating-point parts of the ABI and in
everything related to interrupts. But for integer code, it looks like the
T32 ABI is the same.
May be, mutual incompatibility with MCUs would become true again in
the next generation of RPi. But more likely it would not happen
until RPi7.
On Wed, 11 Dec 2024 21:19:54 +0000
bart <bc@freeuk.com> wrote:
This also comes up with 'while (cond); {...}'.
$ cat foo.c
void foo(int x)
{
while (x--);
bar();
}
$ clang-format < foo.c
void foo(int x) {
  while (x--)
    ;
  bar();
}
Do I use clang-format myself? No, I don't.
And I don't know why C formatters are generally less popular among C programmers than, for example, the Go formatter among Go programmers or the Rust formatter among Rust programmers.
bart <bc@freeuk.com> writes:
[...]
My experience of multi-level break is that there are two main use-cases:
* Used in the current loop only (not necessarily the innermost to an
observer). This is the most common
* Used to exit the outermost loop
So to support these, named or even numbered loops are not
necessary. (Eg. I use 'exit' or 'exit all'.)
I would oppose a change to C that only applied to innermost and
outermost loops. For one thing, I'm not aware of any other language
that does this (except perhaps your unnamed one). For another,
it's easy enough to define a feature that handles any arbitrary
nesting levels, by applying names (labels) to loops.
Named labels do have some advantages, such as being absolute
while indices are relative. But sometimes you need an absolute
reference when refactoring, and sometimes you want it relative.
I'm not sure what you mean by "relative". Do you have an example?
If, say, I have a "break 2" statement (the Bourne shell uses that
syntax), and I wanted to refactor the code, I'll at least have
to pay very close attention to the number, and likely change it.
Is that what you mean by "relative"? If I have "break NAME", it's
more likely that I won't have to change it. (Disclaimer: I don't
think I've ever used the "break N" feature of the Bourne shell.)
If duplicating within the same function, then you also need to think
about scope rules for those named labels.
That's hardly the only case where duplicating code within a function
can cause conflicts.
A common proposal is to use existing labels, whose scope is already
well defined. Labels have *function scope*. You just have to make
sure that all labels within a function are unique.
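For concreteness, here is a sketch of the existing-label workaround being described (the data and names are invented): today the usual way to leave two loops at once is a goto to a label, which already has the function scope mentioned above; a "break <label>" feature would name the loop itself instead of a point after it.

    #include <stdio.h>

    int main(void)
    {
        int grid[3][4] = { {1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12} };
        int target = 7, fi = -1, fj = -1;

        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < 4; j++) {
                if (grid[i][j] == target) {
                    fi = i;
                    fj = j;
                    goto found;          /* leaves both loops at once */
                }
            }
        }
    found:
        printf("found at (%d,%d)\n", fi, fj);   /* prints "found at (1,2)" */
        return 0;
    }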
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
[...]
For (yet another) example; my K&R shows a syntax for expressions like
expression := binary
binary := binary + binary
binary := binary * binary
That's odd. Is that an exact quotation?
[...]
My copy of K&R 1st edition (1978) has, among other things:
multiplicative-expression:
additive-expression:
On 11.12.2024 03:21, bart wrote:
On 09/12/2024 18:46, Janis Papanagnou wrote:
An unambiguous grammar is something quite essential; how would you
parse code if it were ambiguous?
You can easily parse a language in an ambiguous syntax. [...]
Sure; I recall Fortran had such an ambiguity, I think it was in the
context of FOR loops (something with the assignment and commas,
IIRC). - Whether it's "easy to parse" is arguable, though,
and certainly depends. - But I don't recall having seen scary
things like that in any other language I have dealt with.
Experienced language developers wouldn't define an ambiguous
syntax in the first place. So what do you think your statement
contributes to the desire to have an "unambiguous syntax"?
I mean, if you get confused by unambiguous syntaxes already,
what do you think happens with people if they have to program
in or understand an ambiguous language!
You postulate it as if the grammar were convoluted;
My opinion is that it is. Especially its propensity for using
long-winded production terms that look confusingly similar. (Would it
have killed somebody to make the names more distinct?)
(I won't comment on your opinion.)
With respect to distinctness of names I thought we had already
exchanged some words - remember my 'while' and 'until' samples
from Pascal (that somehow offended you)?
Concerning "C"; I don't see why you shouldn't name a "positive"
conditioned control construct consistently(!) as 'while' whether
it controls a loop entry or a loop exit, and to name a "negative"
conditioned control construct consistently(!) as 'until'.
On 11/12/2024 23:26, Michael S wrote:
On Wed, 11 Dec 2024 21:35:06 +0100
David Brown <david.brown@hesbynett.no> wrote:
On 11/12/2024 18:22, Michael S wrote:
On Wed, 11 Dec 2024 16:15:25 +0100
David Brown <david.brown@hesbynett.no> wrote:
Suffice it to say that as far as code generation is concerned,
the Cortex-A devices on the RPi's are completely different from
the Cortex-M devices in microcontrollers. It's like comparing
the 80286 with the Z80A.
With the exception of the RPi1, saying so would be an exaggeration.
RPi2 through 5 are technically capable of running Thumb2-encoded user
level routines.
I did not know that the 64-bit Cortex-A devices could run 32-bit
Thumb2-encoded instructions. That does reduce the difference
somewhat.
Support for T32 in ARMv8-A is optional.
In practice, majority of "two-digit" Cortex-A devices support it.
There are only two exceptions - A34 and A65, both rather obscure.
Do you have any thoughts on why it is supported at all? Who would
use it? I can see support for 32-bit ARM code being of interest on
64-bit Cortex-A devices for backwards compatibility - you can run old
RPi 1 binaries on an RPi 4, for example. But where would you want to
run Thumb2 binaries on a 64-bit Cortex-A device - especially when you
could not do so on 32-bit Cortex-A predecessors.
"Three-digit", i.e. ARMv9 Cortex-A cores are another story. Most of
them either do not support T32 (and A32) at all, or make it
optional for implemetator.
On the other hand, nearly all Arm Inc. server cores, even those
that are very closely related to Cortex-A cores, like Neoverse N1
(a variant of Cortex-A76) do *not* support aarch32.
OK - thanks for that information. My ARM work has primarily been
with 32-bit Cortex-M devices (so "ARM architectures" like ARMv7-M and ARMv8-M). 64-bit microcontrollers are not common, as yet. (There is
a 64-bit version of ARMv8-R, but not of the -M architectures.) Maybe
we'll be on RISC-V before 64-bit microcontrollers fit the needs of
our customers.
I've used 64-bit Cortex-A devices in embedded Linux systems, but I've
not needed to consider the details of the architecture as you
typically use these at a higher level of abstraction (at least for
usermode code, which is all I have done there).
Compilers targeting 64-bit ARM would generate 64-bit (AArch64)
instructions, which are of course significantly different.
Understatement detected.
But
perhaps a compiler targeting 32-bit Cortex-A devices might be able
to generate Thumb2 instructions.
The majority of compilers are. But it seems that tcc is not.
Or maybe it is. The tcc docs are too sparse.
I don't think there would be much reason to generate Thumb2 code
unless it was for running on microcontrollers. So it would be a lot
of work for a compiler developer for little purpose. I don't know if
tcc targeted 32-bit ARM or 64-bit ARM.
If it is the former, then
Thumb2 would be less effort to implement since it is mostly a
different encoding of the same instruction set - but it would be of
little use since (if I understand you correctly) 32-bit Cortex-A
devices don't support Thumb2.
And if tcc supports 64-bit ARM, then
the Thumb2 generation would be much more work since it is a
significantly different ISA. And again, how many people actually
want Thumb2 binaries for their 64-bit Cortex-A devices?
However, you would not expect binary compatibility between code
generated for a 32-bit Cortex-M device and a Cortex-A platform,
even if it supports Thumb2 instructions - you have major
differences in the ABI, memory layouts, and core features beyond
the basic registers.
There are major differences in the floating-point parts of the ABI and in everything related to interrupts. But for integer code, it looks like the
T32 ABI is the same.
I'm sure much of it is the same (after all, it is solving the same
basic problem), but the details are critical to making things work.
As well as the points you made, I would guess there are differences
to the way code and data is addressed - on Linux, you expect dynamic link/loading, and typically have some level of indirection so that it
can handle address-space randomisation, linking to dynamic libraries,
etc. On microcontrollers, code is normally compiled and linked for a
fixed static memory layout.
Maybe mutual incompatibility with MCUs would become true again
in the next generation of RPi. But more likely it would not happen
until RPi7.
On 11.12.2024 18:20, bart wrote:
On 11/12/2024 16:51, Janis Papanagnou wrote:
[ dangling else ]
So here you finally acknowledge there may be ambiguity from a human
perspective.
From a semantic perspective, not from a syntactic one. There are rules
to disambiguate a dangling else. And there are languages that designed
a syntax without an inherent semantic ambiguity (which would need to
be cleared up by such rules).
For (yet another) example; my K&R shows a syntax for expressions like
expression := binary
binary := binary + binary
binary := binary * binary
An actual expression x = 2 + 3 * 4 would be "ambiguous" (without precedence rules).
(What a waste of time!)
On 11/12/2024 06:37, Waldek Hebisch wrote:
Concerning tcc, they have explicit endorsment from gawk developer:
he likes compile speed and says that gawk compiles fine using tcc.
Is that so that you can say I am wrong to claim "no one" cares about tcc support, because you have found 1 person who has used it? I admit it -
"no one" was an exaggeration.
On 11/12/2024 14:39, Waldek Hebisch wrote:
C grammar is not LL(1), that is you can not recognize a construct
looking at its first token. You need the whole construct and
sometimes also the following token. And there are also well
known troubles due to type names and "dangling else".
I can't call it 'context sensitive' because that is a technical
term with a specific meaning, [...]
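A minimal illustration of the type-name trouble mentioned above (identifiers hypothetical): the same token sequence is a declaration or an expression depending on what T denotes at that point, so the parser needs the symbol table, not just the grammar.

typedef int T;

void f(void)
{
    T * x = 0;      /* here: a declaration of x as pointer to int       */
    (void)x;
}

void g(double T, double x)
{
    T * x;          /* here: an expression statement, T multiplied by x */
}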
On 11.12.2024 16:03, David Brown wrote:
On 11/12/2024 06:37, Waldek Hebisch wrote:
Concerning tcc, they have explicit endorsment from gawk developer:
he likes compile speed and says that gawk compiles fine using tcc.
Who was that?
What I find documented in the GNU Awk package was this:
_The Tiny C Compiler, 'tcc'_
This compiler is _very_ fast, but it produces only mediocre code.
It is capable of compiling 'gawk', and it does so well enough that
'make check' runs without errors.
However, in the past the quality has varied, and the maintainer has
had problems with it. He recommends using it for regular
development, where fast compiles are important, but rebuilding with
GCC before doing any commits, in case 'tcc' has missed
something.(1)
[...]
(1) This bit the maintainer once.
That doesn't quite sound like the GNU Awk folks would think it's a good
tool or anything even close ("mediocre code", "well enough", "runs
without errors", "quality has varied", "had problems with it"). And that
it's obviously not trustworthy given the suggestion: "rebuilding with
GCC before doing any commits".
And I cannot find any statement that "he likes compile speed", he just
stated that it is very fast (which seems to just have astonished him).
On 11/12/2024 16:51, Janis Papanagnou wrote:
On 11.12.2024 17:47, David Brown wrote:
On 11/12/2024 17:30, Janis Papanagnou wrote:
On 11.12.2024 09:43, Ike Naar wrote:
On 2024-12-09, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
An unambiguous grammar is something quite essential; how would you
parse code if it were ambiguous?
Here's an ambiguity in the C grammar:
[...]
The following selection-statement is grammatically ambiguous:
if (E1) if (E2) S1 else S2
Yes, the dangling else is a common ambiguity in many programming
languages.
That's why I prefer languages with syntaxes like in Algol 68 or
Eiffel (for example).
It is easy to avoid in a C-like language - simply require braces on "if"
statements, or at the very least, require them when there is an "else"
clause.
Yes, sure. But, I can't help, it smells like a workaround.
Most C coding standards and style guides make that requirement
- not because the C compiler sees it as ambiguous, but because humans
often do. (Or they misinterpret it.)
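A small illustration of that human-side problem, and of the brace fix such style guides require (condition and function names hypothetical):

static void g(void) {}
static void h(void) {}

void f(int a, int b)
{
    if (a)
        if (b)
            g();
    else                /* indented as if it paired with "if (a)",  */
        h();            /* but the grammar pairs it with "if (b)"   */

    if (a) {            /* required braces make the intent explicit */
        if (b)
            g();
    } else {
        h();
    }
}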
Yes, true. (We had that in our standards, too.)
So here you finally acknowledge there may be ambiguity from a human perspective.
But when I try to make that very point, it's me [...]
On 11/12/2024 16:24, Janis Papanagnou wrote:
[...]
Concerning "C"; I don't see why you shouldn't name a "positive"
conditioned control construct consistently(!) as 'while' whether
it controls a loop entry or a loop exit, and to name a "negative"
conditioned control construct consistently(!) as 'until'.
I sometimes use 'when' for positive rather than 'if', and 'until' or
'unless' for negative.
(Or just use a 'not' prefix for negative; but for this purpose, it's
more helpful for it to have low precedence rather than high, to avoid
parentheses around the whole thing.)
[...] (here I'm talking about my syntax);
[snip]
On 12/12/2024 14:03, Janis Papanagnou wrote:
On 11.12.2024 16:03, David Brown wrote:
On 11/12/2024 06:37, Waldek Hebisch wrote:
Concerning tcc, they have explicit endorsment from gawk developer:
he likes compile speed and says that gawk compiles fine using tcc.
Who was that?
What I find documented in the GNU Awk package was this:
_The Tiny C Compiler, 'tcc'_
This compiler is _very_ fast, but it produces only mediocre code.
It is capable of compiling 'gawk', and it does so well enough that
'make check' runs without errors.
However, in the past the quality has varied, and the maintainer has
had problems with it. He recommends using it for regular
development, where fast compiles are important, but rebuilding with
GCC before doing any commits, in case 'tcc' has missed
something.(1)
[...]
(1) This bit the maintainer once.
That doesn't quite sound like the GNU Awk folks would think it's a good
tool or anything even close ("mediocre code", "well enough", "runs
without errors", "quality has varied", "had problems with it") And that
it's obviously not trustworthy given the suggestion: "rebuilding with
GCC before doing any commits".
This sounds like you imposing your own interpretation, and trying to
downplay the credibility of TCC.
And I cannot find any statement that "he likes compile speed", he just
stated that it is very fast (which seems to just have astonished him).
This looks like the original source:
[link snipped]
This is what it said just before:
[...]
On 12.12.2024 15:37, bart wrote:
On 12/12/2024 14:03, Janis Papanagnou wrote:
On 11.12.2024 16:03, David Brown wrote:
On 11/12/2024 06:37, Waldek Hebisch wrote:
Concerning tcc, they have explicit endorsment from gawk developer:
he likes compile speed and says that gawk compiles fine using tcc.
Who was that?
What I find documented in the GNU Awk package was this:
_The Tiny C Compiler, 'tcc'_
This compiler is _very_ fast, but it produces only mediocre code.
It is capable of compiling 'gawk', and it does so well enough that
'make check' runs without errors.
However, in the past the quality has varied, and the maintainer has
had problems with it. He recommends using it for regular
development, where fast compiles are important, but rebuilding with
GCC before doing any commits, in case 'tcc' has missed
something.(1)
[...]
(1) This bit the maintainer once.
That doesn't quite sound like the GNU Awk folks would think it's a good
tool or anything even close ("mediocre code", "well enough", "runs
without errors", "quality has varied", "had problems with it") And that
it's obviously not trustworthy given the suggestion: "rebuilding with
GCC before doing any commits".
This sounds like you imposing your own interpretation, and trying to
downplay the credibility of TCC.
You don't think all these words are a clear indication? - The original
text you see above is almost just a concatenation of all these negatively connoted words. It really doesn't need any words or interpretation of my own.
Aren't those original words, experiences, and suggestions clear to you?
(I have neither a reason nor an agenda to downplay any compiler.
On 11/12/2024 23:26, Michael S wrote:
On Wed, 11 Dec 2024 21:35:06 +0100
David Brown <david.brown@hesbynett.no> wrote:
I don't think there would be much reason to generate Thumb2 code unless
it was for running on microcontrollers.
So it would be a lot of work
for a compiler developer for little purpose. I don't know if tcc
targeted 32-bit ARM or 64-bit ARM. If it is the former, then Thumb2
would be less effort to implement since it is mostly a different
encoding of the same instruction set - but it would be of little use
since (if I understand you correctly) 32-bit Cortex-A devices don't
support Thumb2. And if tcc supports 64-bit ARM, then the Thumb2
generation would be much more work since it is a significantly different
ISA. And again, how many people actually want Thumb2 binaries for their 64-bit Cortex-A devices?
However, you would not expect binary compatibility between code
generated for a 32-bit Cortex-M device and a Cortex-A platform, even
if it supports Thumb2 instructions - you have major differences in
the ABI, memory layouts, and core features beyond the basic registers.
There are major differences in the floating-point parts of the ABI and in
everything related to interrupts. But for integer, it looks like the
T32 ABI is the same.
I'm sure much of it is the same (after all, it is solving the same basic problem), but the details are critical to making things work.
On 12.12.2024 05:38, James Kuyper wrote:
On 12/10/24 20:32, Scott Lurndal wrote:
[...]
You're confusing name spaces and namespaces.
It became quite obvious that you both were talking at cross purposes.
Personally I'd colloquially take both those terms as the same thing,
but I'm not a native speaker.
On 12/12/24 00:07, Janis Papanagnou wrote:
On 12.12.2024 05:38, James Kuyper wrote:
On 12/10/24 20:32, Scott Lurndal wrote:
[...]
You're confusing name spaces and namespaces.
It became quite obvious that you both were talking at cross purposes.
Personally I'd colloquially take both those terms as the same thing,
but I'm not a native speaker.
They are not.
The C standard does not define the term "name space", relying instead
upon the general CS definition of the term. However, it has a whole
section (6.2.3) devoted to listing and explaining the name spaces that
are relevant to the C language.
In C++, "namespace" is a both a keyword (listed in 5.11p3) and a piece
of terminology (defined in 9.8p1) for the feature enabled by that
keyword. It is clear from those descriptions that a C++ namespace is significantly different thing from a C name space.
Unlike the C standard, the C++ standard doesn't even bother explaining
name spaces. It makes only two uses of that term on it's own behalf, in connection with statement labels and macro names - neither usage has any plausible connection with a namespace. There are several occurrences of
"name space" in the section describing the differences between C and
C++, which make it clear that both standards are using the same meaning
for "name space", but that the two languages have a different number of
name spaces, with different contents.
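For illustration, a small piece of valid C that relies on the tag name space being separate from the ordinary identifier name space (the identifiers are made up):

struct node { int value; };     /* "node" here lives in the tag name space    */

int main(void)
{
    int node = 42;              /* a different "node", in the ordinary space  */
    struct node n = { node };   /* the tag is looked up in the tag name space */
    return n.value - node;      /* returns 0 */
}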
On 12/12/2024 15:20, Janis Papanagnou wrote:
[...]
You don't think all these words are a clear indication? - The original
text you see above is almost just a concatenation of all these negative
connoted words. It really doesn't need any own words or interpretation.
That's the point: you've extracted only the negative words to give a misleading picture.
How about highlighting these as well:
[...]
(I have neither a reason nor an agenda to downplay any compiler.
Yet, you clearly are downplaying it.
On 12.12.2024 19:17, bart wrote:
On 12/12/2024 15:20, Janis Papanagnou wrote:
[...]
You don't think all these words are a clear indication? - The original
text you see above is almost just a concatenation of all these negative
connoted words. It really doesn't need any own words or interpretation.
That's the point: you've extracted only the negative words to give a
misleading picture.
If you "concatenate" the significant words and glue them together
with words to create an English sentence, and both are quite the
same, what do you expect?
This compiler is , but it produces .
It is of compiling 'gawk', and it does so that
'make check' runs .
However, in the past the , and the maintainer has
with it. He recommends using it for regular
development, where fast compiles are , but ,
in case 'tcc' has .
I've extracted the words that carry the quality semantics, and the
result is completely useless.
Extracting the key-attributes and key-characteristics, and the
author's valuations drawn from experiences, was done by me for the
readers' convenience - or for (lowbrow?) people who are not willing
to see or identify those attributes in the text "hidden" between
all those "meaningless" glue-words.
Again, just for your argument you make up nonsensical imputations.
Why - don't - you - stop - that - ?
How about highlighting these as well:
(I already told you that I'm *not* *interested* in advocating any
specific compiler, neither tcc nor gcc or anything else. So I will
not play your game. - Why don't you make your cat fight with folks
who are strong proponents or opponents of such tools as you are!)
I have noted that you have a strong personal affinity to that tool;
but I don't care. (If anything, I'm astonished about your fanaticism.)
What I did care about was; about whom Waldek spoke when formulating
"explicit endorsement from gawk developer" - I asked "Who was that?"
Because I was surprised by his statement
and curious where he got
that idea from. Since the statement I found gave a fairly different
picture. YMMV. - And since I know Arnold - the head of the GNU Awk maintainers - from various public and private conversations, Waldek's interpretation (and yours, of course) irritated me, to say the least.
My guess is that Waldek had no other source of information, that he
read (or mis-read, as one likes) exactly the text I quoted, but I'm
not sure. (Only he can clarify that. Not you, Bart.)
[...]
(I have neither a reason nor an agenda to downplay any compiler.
Yet, you clearly are downplaying it.
I am not interested in "compiler wars".
problem to accept that,
On 30/11/2024 00:44, Keith Thompson wrote:...
David apparently has a different definition of "totally different types"
than you do. Since the standard doesn't define that phrase, I suggest
not wasting time arguing about it.
"int", "void" and "double" are totally different types in my view.
"int", "pointer to int", "array of int", "function returning int" all
have a relation that means I would not describe them as /totally/
different types - though I would obviously still call them /different/
types.
The syntax of C allows one declaration statement to declare multiple identifiers of types related in this way - it does not allow declaration
of identifiers of /totally/ different types.
On 12.12.2024 20:39, James Kuyper wrote:...
On 12/12/24 00:07, Janis Papanagnou wrote:
Personally I'd colloquially take both those terms as the same thing,
but I'm not a native speaker.
They are not.
The C standard does not define the term "name space", relying instead
upon the general CS definition of the term. However, it has a whole
section (6.2.3) devoted to listing and explaining the name spaces that
are relevant to the C language.
In C++, "namespace" is a both a keyword (listed in 5.11p3) and a piece
of terminology (defined in 9.8p1) for the feature enabled by that
keyword. It is clear from those descriptions that a C++ namespace is
significantly different thing from a C name space.
Unlike the C standard, the C++ standard doesn't even bother explaining
name spaces. It makes only two uses of that term on it's own behalf, in
connection with statement labels and macro names - neither usage has any
plausible connection with a namespace. There are several occurrences of
"name space" in the section describing the differences between C and
C++, which make it clear that both standards are using the same meaning
for "name space", but that the two languages have a different number of
name spaces, with different contents.
You seem to be speaking about terms in different standards while
I spoke about "colloquially", the meaning of an expression named
"name space", or "name-space", or "namespace" (whatever is the
correct writing in English would be; note my hint about native
speakers). - So, now, have we two been speaking cross-purpose?
Note that Scott in his post wrote
In C++ they reside in the ordinary name space only if they're not
part of a named namespace.
which I interpreted as
In C++ they reside in the ordinary "name space" only if they're not
part of a named 'namespace'.
where "name space" would be the [colloquial] name of the concept
behind the symbol 'namespace'.
My interpretation of his post was that he wanted to differentiate
the default 'namespace' from the "named 'namespace'". (And that
the three types of "Name Spaces" (that you had in mind) were not
his concern with his remark.)
On 12/1/24 06:34, David Brown wrote:
On 30/11/2024 00:44, Keith Thompson wrote:...
David apparently has a different definition of "totally different types"
than you do. Since the standard doesn't define that phrase, I suggest
not wasting time arguing about it.
"int", "void" and "double" are totally different types in my view.
"int", "pointer to int", "array of int", "function returning int" all
have a relation that means I would not describe them as /totally/
different types - though I would obviously still call them /different/
types.
The syntax of C allows one declaration statement to declare multiple
identifiers of types related in this way - it does not allow declaration
of identifiers of /totally/ different types.
There's a rule I sometimes find useful, when trying to choose a precise definition for a poorly defined term: figure out what statements you'd
like to say using the term, then define it in such a way as to guarantee
that those statements are correct.
In C, a declaration may contain an init-declarator-list, preceded by declaration-specifiers and optionally by an attribute-specifier-sequence (6.7p1). Each of the declarators in the list shares the
declaration-specifiers and the attribute-specifier-sequence (6.7p7). Any syntax that's part of a declarator applies to that declarator's identifier.
Therefore, your statement suggests that two types should be considered "totally different types" if they are incompatible in either the declaration-specifiers or the attribute-specifier-sequence. With that definition, 6.7p7 in the standard would guarantee the truth of your
statement above.
Does that definition sound suitable?
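For concreteness, a made-up declaration showing what "sharing the declaration-specifiers" means here:

static const unsigned long total, *latest, counts[8];
/* all three declarators share the declaration-specifiers                */
/* "static const unsigned long"; the * and [8] belong to the individual  */
/* declarators, so the identifiers get related, but not identical, types */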
On 12/12/24 15:27, Janis Papanagnou wrote:
[...]
Note that Scott in his post wrote
In C++ they reside in the ordinary name space only if they're not
part of a named namespace.
which I interpreted as
In C++ they reside in the ordinary "name space" only if they're not
part of a named 'namespace'.
where "name space" would be the [colloquial] name of the concept
behind the symbol 'namespace'.
My interpretation of his post was that he wanted to differentiate
the default 'namespace' from the "named 'namespace'". (And that
the three types of "Name Spaces" (that you had in mind) were not
his concern with his remark.)
That is a true statement about namespaces, and completely irrelevant to anything I was trying to say about name spaces.
He said that in response to my comment, which was supposed to be:
"In C struct, union, and enumeration tags have their own name space, in
C++ they reside in the ordinary name space."
[...]
On 13.12.2024 02:31, James Kuyper wrote:
On 12/12/24 15:27, Janis Papanagnou wrote:
[...]
Note that Scott in his post wrote
In C++ they reside in the ordinary name space only if they're not
part of a named namespace.
which I interpreted as
In C++ they reside in the ordinary "name space" only if they're not
part of a named 'namespace'.
where "name space" would be the [colloquial] name of the concept
behind the symbol 'namespace'.
My interpretation of his post was that he wanted to differentiate
the default 'namespace' from the "named 'namespace'". (And that
the three types of "Name Spaces" (that you had in mind) were not
his concern with his remark.)
That is a true statement about namespaces, and completely irrelevant to
anything I was trying to say about name spaces.
He said that in response to my comment, which was supposed to be:
"In C struct, union, and enumeration tags have their own name space, in
C++ they reside in the ordinary name space."
I understand that initially you might not have used the most
accurate [standards-] term; that was not my point. I merely
wanted to point out that [as to my reading] Scott wanted to
make a different clarification (IMO a nit-pick, if you want);
to extend your explanation (of ordinary [unnamed] 'namespace's
by mentioning named 'namespace's). Whether you wanted to say
"name space" or whether you meant 'namespace'. - That was all;
no deeper thought, no principal disagreement. :-)
(But it's also possible I misinterpreted Scott's intention.)
What I did care about was; about whom Waldek spoke when formulating
"explicit endorsement from gawk developer" - I asked "Who was that?"
Because I was surprised by his statement and curious where he got
that idea from. Since the statement I found gave a fairly different
picture. YMMV. - And since I know Arnold - the head of the GNU Awk maintainers - from various public and private conversations, Waldek's interpretation (and yours, of course) irritated me, to say the least.
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
What I did care about was; about whom Waldek spoke when formulating
"explicit endorsement from gawk developer" - I asked "Who was that?"
Because I was surprised by his statement and curious where he got
that idea from. Since the statement I found gave a fairly different
picture. YMMV. - And since I know Arnold - the head of the GNU Awk
maintainers - from various public and private conversations, Waldek's
interpretation (and yours, of course) irritated me, to say the least.
You can ask Arnold what he meant. I saw a reasonably recent post
by him implying that he is still using tcc. And a sequence of
posts from 2013, where he reported problems with tcc and later
wrote that a new version (containing fixes) works to compile
gawk. His messages indicated that he cared about compile
speed and considered tcc to be fast.
From a message to tinycc-devel dated 'Sun, 06 Jan 2013':
: It is quite fast, which is a significant pleasure compared to
: gcc or clang.
There he reports problems; a later message confirms that changes
to tcc fixed them.
Let me summarize the facts as I see them:
- he used tcc to develop gawk
- he said that tcc can be used to compile gawk
- he complained about the speed of gcc/clang and noted that tcc is fast.
If that is not an endorsement, then what is?
Concerning the statement that you found, "mediocre code", I think
this is about the speed of the generated code. If I were writing
about tcc I would use different words to make it clearer,
but this is a fair warning for people who want to install
and use gawk. For me, testing with gcc before a commit sounds
like common sense. Still, the snippet says "He recommends
using it for regular development". He could have said "I do not
recommend using tcc", or "use at your own risk".
Concerning the "fairly different picture", I do not know what
"in the past the quality has varied" means. If that means
that there are recurring troubles after the 2013 fixes, then
this somewhat spoils the picture. If that is about what was
before 2013, then this sounds like a reasonable disclaimer.
Anyway, the point was about using 'tcc' for development,
and clearly Aharon Robbins was serious about using 'tcc'
for developing 'gawk'. He reported problems to the 'tcc'
developers and AFAICS the problems got fixed (Google finds
messages from 2013 but no later reports of trouble).
On 13/12/2024 15:20, Waldek Hebisch wrote:
[...]
[...]
(As an aside, who would ever compile gawk, other than the developers and people building *nix distributions - who would all be using gcc or clang
for the job? [...]
On 13.12.2024 02:31, James Kuyper wrote:...
He said that in response to my comment, which was supposed to be:
"In C struct, union, and enumeration tags have their own name space, in
C++ they reside in the ordinary name space."
I understand that initially you might have not used the most
accurate [standards-] term; that was not my point. I merely
wanted to point out that [as to my reading] Scott wanted to
make a different clarification (IMO a nit-pick, if you want);
to extend your explanation (of ordinary [unnamed] 'namespace's
by mentioning named 'namespace's).
On 13.12.2024 17:29, David Brown wrote:
On 13/12/2024 15:20, Waldek Hebisch wrote:
[...]
Since all I wanted is to know where you got your impression from,
Waldek, my question is (yet) fully answered. - Thanks.
I'm also glad that David had already thoroughly replied to your
post, so I'm relieved of that burden. As became obvious, he read
(and interpreted) the quoted statement in exactly the same way as
I did. - Thanks as well.
[...]
(As an aside, who would ever compile gawk, other than the developers and
people building *nix distributions - who would all be using gcc or clang
for the job? [...]
I'm a bit puzzled about that statement. - I certainly compile the
GNU Awk source from the tar-file whenever I get my hands on a new
version (including beta-test versions).
I haven't looked into the package makefile but I suppose it will
use the "cc" that is standard on my system (which is a 'gcc').
(It requires literally not more than a nett minute (to download &
unpack & configure & make & install) on my old and rusty Linux box.)
Janis
On 13/12/2024 15:20, Waldek Hebisch wrote:
- he complained about speed of gcc/clang and noted that tcc is fast.
He said that tcc is "quite fast", faster than gcc or clang, and that he
liked that it was fast. That is not the same thing as a direct
complaint about the speed of gcc or clang
- though clearly he would have
been happier if those compilers had been faster. (And we all would be happier if they were faster - even those of us who find gcc fast enough
for our needs.)
If that is not an endorsement, than what is?
It is saying that tcc is a tool you can use to compile gawk, and praise
of its speed relative to gcc and clang. An endorsement would be saying
that it is the compiler he likes to use or recommends using.
On 13/12/2024 18:26, Janis Papanagnou wrote:
On 13.12.2024 17:29, David Brown wrote:
(As an aside, who would ever compile gawk, other than the developers and
people building *nix distributions - who would all be using gcc or clang
for the job? [...]
I'm a bit puzzled about that statement. - I certainly compile the
GNU Awk source from the tar-file whenever I get my hands on a new
version (including beta-test versions).
Okay, I guess that answers my question - there /are/ some people who
compile it from source on a regular basis.
But I suspect that for a tool like gawk, that would be rare - for most
people who use gawk, it usually makes little difference if they use a
version from last month or last century, so they will use whatever their distribution provides.
(I guess people who use source-based Linux
distros will compile it on their own machine, but they too will
generally use gcc or clang for that.)
You are clearly far more of a gawk user than I am - do you think many
people actively (as distinct from, say, a general update on a
source-based Linux distro) download the source and compile it themselves?
On 13/12/2024 01:59, James Kuyper wrote:...
On 12/1/24 06:34, David Brown wrote:
"int", "void" and "double" are totally different types in my view.
"int", "pointer to int", "array of int", "function returning int" all
have a relation that means I would not describe them as /totally/
different types - though I would obviously still call them /different/
types.
The syntax of C allows one declaration statement to declare multiple
identifiers of types related in this way - it does not allow declaration
of identifiers of /totally/ different types.
There's a rule I sometimes find useful, when trying to choose a precise
definition for a poorly defined term: figure out what statements you'd
like to say using the term, then define it in such a way as to guarantee
that those statements are correct.
In C, a declaration may contain an init-declarator-list, preceded by
declaration-specifiers and optionally by an attribute-specifer-sequence
(6.7p1). Each of the declarators in the list share the
declaration-specifiers and the attribute-specifier-sequence (6.7p7). Any
syntax that's part of a declarator applies to that declarator's
identifier.
Therefore, your statement suggests that two types should be considered
"totally different types" if they are incompatible in either the
declaration-specifiers or the attribute-specifier-sequence. With that
definition, 6.7p7 in the standard would guarantee the truth of your
statement above.
Does that definition sound suitable?
That definition sounds correct, yes, but also completely useless. It
leads directly to a tautology - you can declare two things in the same declaration in C if you are allowed to declare them in the same
declaration in C.
On 13/12/2024 15:20, Waldek Hebisch wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
What I did care about was; about whom Waldek spoke when formulating
"explicit endorsement from gawk developer" - I asked "Who was that?"
Because I was surprised by his statement and curious where he got
that idea from. Since the statement I found gave a fairly different
picture. YMMV. - And since I know Arnold - the head of the GNU Awk
maintainers - from various public and private conversations, Waldek's
interpretation (and yours, of course) irritated me, to say the least.
You can ask Arnold what he meant. I saw reasonably recent post
by him implying the he is still using tcc. And a seqence of
posts from 2013, where he reported problem with tcc and later
wrote that new version (containing fixes) works to compile
gawk. His messages indicated that he cared about compile
speed and considerd tcc to be fast.
From message about to tinycc-devel dated 'Sun, 06 Jan 2013':
: It is quite fast, which is a significant pleasure compared to
: gcc or clang.
There he reports problems, later message confirms that changes
to tcc fixed the them.
Let me summarize facts as I see them:
- he used tcc to develop gawk
I don't see that from what you have written here. Perhaps it is true,
but unless I have missed something, you haven't given evidence of that.
Note that "used tcc /while/ developing gawk" is not at all the same
thing as "used tcc /to/ develop gawk". The tools you use to develop something are your main tools that help you produce good, working code.
gcc is the tool he uses to get correct code - tcc is merely for quick
(Again, let me stress that it is possible that he
did use tcc as a main compiler during development, but I don't think you
have shown that.)
- he said that tcc can be used to compile gawk
Yes.
- he complained about speed of gcc/clang and noted that tcc is fast.
He said that tcc is "quite fast", faster than gcc or clang, and that he
liked that it was fast. That is not the same thing as a direct
complaint about the speed of gcc or clang
- though clearly he would have
been happier if those compilers had been faster. (And we all would be happier if they were faster - even those of us who find gcc fast enough
for our needs.)
If that is not an endorsement, than what is?
It is saying that tcc is a tool you can use to compile gawk, and praise
of its speed relative to gcc and clang. An endorsement would be saying
that it is the compiler he likes to use or recommends using.
If I write a program and say it can run on Linux or Windows, that is not
an endorsement for Windows.
David Brown <david.brown@hesbynett.no> wrote:
On 13/12/2024 15:20, Waldek Hebisch wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
What I did care about was; about whom Waldek spoke when formulating
"explicit endorsement from gawk developer" - I asked "Who was that?"
Because I was surprised by his statement and curious where he got
that idea from. Since the statement I found gave a fairly different
picture. YMMV. - And since I know Arnold - the head of the GNU Awk
maintainers - from various public and private conversations, Waldek's
interpretation (and yours, of course) irritated me, to say the least.
You can ask Arnold what he meant. I saw reasonably recent post
by him implying the he is still using tcc. And a seqence of
posts from 2013, where he reported problem with tcc and later
wrote that new version (containing fixes) works to compile
gawk. His messages indicated that he cared about compile
speed and considerd tcc to be fast.
From message about to tinycc-devel dated 'Sun, 06 Jan 2013':
: It is quite fast, which is a significant pleasure compared to
: gcc or clang.
There he reports problems, later message confirms that changes
to tcc fixed the them.
Let me summarize facts as I see them:
- he used tcc to develop gawk
I don't see that from what you have written here. Perhaps it is true,
but unless I have missed something, you haven't given evidence of that.
Note that "used tcc /while/ developing gawk" is not at all the same
thing as "used tcc /to/ develop gawk". The tools you use to develop
something are your main tools that help you produce good, working code.
gcc is the tool he uses for get correct code - tcc is merely for quick
turnaround testing.
I do not get what distinction you want to make. If somebody listens
to music while coding, then "while" looks appropriate. I develop
a project intending to support compilation by compilers A, B, C, D, E
(that is, 5 different compilers); most development happens using
compiler A, but there are pieces of code specific to other compilers,
which were developed using the appropriate compiler. From time to time
I run tests to verify that using each compiler produces the expected
results. It is fair to say that I do a small part of the development
using other compilers, but I would say that I use them _to_ develop
the project. It does not matter that they are not my main compiler.
Concerning statements about tcc, the blurb that Janis quoted literally
says 'using tcc for development'. And my reading of the whole
blurb is that he was 'using tcc for development' (there are more
statements on the net, but I do not think it makes sense for me
to dig them up; this should be enough). I am not a native English
speaker, but for me 'using X for development of Y' and 'using X
to develop Y' are synonyms, with the second being shorter than the
first, while in the first skipping Y is more natural.
(Again, let me stress that it is possible that he
did use tcc as a main compiler during development, but I don't think you
have shown that.)
- he said that tcc can be used to compile gawk
Yes.
- he complained about speed of gcc/clang and noted that tcc is fast.
He said that tcc is "quite fast", faster than gcc or clang, and that he
liked that it was fast. That is not the same thing as a direct
complaint about the speed of gcc or clang
Janis snipped the part containing the complaint. Janis claimed that the complaint does not mean that he liked the speed of 'tcc'; that is why
I specifically quoted the part where he said that he liked the speed.
- though clearly he would have
been happier if those compilers had been faster. (And we all would be
happier if they were faster - even those of us who find gcc fast enough
for our needs.)
If that is not an endorsement, than what is?
It is saying that tcc is a tool you can use to compile gawk, and praise
of its speed relative to gcc and clang. An endorsement would be saying
that it is the compiler he likes to use or recommends using.
If I write a program and say it can run on Linux or Windows, that is not
an endorsement for Windows.
I would take it as an endorsement for "running the program on Windows".
Anyway, you wrote:
: No actual developer would care if their code can be compiled by your
: little toy compiler, or even more complete little tools like tcc.
Gawk developers disprove "no actual developer would care if their
code can be compiled by tcc". A little grep search discovered a few tcc-specific lines added to the configure machinery. A Google search
discovered more projects; I did not try to find how many other
projects cared and how much, but clearly, taken literally,
'no actual developer' is false (which you admitted in another
message). It is possible that the 'majority of actual developers'
do not care about tcc; I have no hard data, but my feeling is that
developers caring about tcc are not such a small minority.
For me this is quite different from "no actual developer".
Even in a non-literal reading, IMO "no actual developer"
exaggerates the smallness of this group.
On 14/12/2024 04:36, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 13/12/2024 15:20, Waldek Hebisch wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
Personally I've also found some problems with TCC in building some
programs, eg. missing bits in headers, or sometimes some mysterious error.
I run TCC in order to compare compilation speed, and to that end I will sometimes tweak things (eg. I used my version of windows.h, or compiled preprocessed C in order to avoid a mire of conditional blocks which try
to determine which compiler is running, and get it wrong).
Once a program has compiled, I don't remember bugs in TCC's code; it's
just a bit slower.
(BTW when I tried to build TCC from source using gcc -O3, I found the resulting compiler even faster than the pre-built binaries! I found that
a bit scary so I test only with binaries as supplied.)
OK, if it's so simple, explain it to me.
Apparently the first line here needs a semicolon after }, the second
doesn't:
int X[1] = {0};
void Y() {}
Similarly here:
if (x) y;
if (x) {}
Why?
"Because that's what the grammar says" isn't a valid answer.
Bart <bc@freeuk.com> wrote:
OK, if it's so simple, explain it to me.
Apparently the first line here needs a semicolon after }, the second
doesn't:
int X[1] = {0};
void Y() {}
Similarly here:
if (x) y;
if (x) {}
Why?
"Because that's what the grammar says" isn't a valid answer.
As a first approximation: for some constructs it is clear where
the end is, so there is no need for a semicolon; for some others a semicolon
is needed to mark the end. This rule does not work in all cases;
for example, 'continue' and 'break' are keywords, and defining the
corresponding statements without a semicolon would lead to
no ambiguity. Similarly in the case of 'goto'. 'return' is
trickier, but it seems that one could define it without a
semicolon. But here probably consistency won: the 'break',
'continue', 'goto', and 'return' statements are perceived
as simple statements and are covered by the informal rule
"a simple statement needs a terminating semicolon". '{}'
is a compound statement, hence not a simple statement.
Similarly, 'if' is a complex statement, and needs no semicolon
of its own. As you noted, without a terminating semicolon the do-while
loop would be ambiguous, so it needs a semicolon. I do
not have an example for declarations, but I suspect that defining
them without a semicolon would be ambiguous. A function
_definition_ is unambiguous without a terminating semicolon,
so why put one there.
Concerning "grammar says it", grammar for C90 from which one
can generate working parser has 74 nonterminals. You could
change some rules and still get working parser for a different
language. So in this sense part of grammar is purely
arbitrary. But other changes would lead to grammar that
fails to work. If you look at rules you will see
substantial similarities between some rules, so grammar
is really simpler than what size alone would would suggest.
So, having working and sane (that is relatively simple)
grammar puts restrictions on the language, some changes
simply do not fit. Some changes would lead to completely
different language, that was not an option for C, as
very first versions were intentionaly similar to earlier
languages and later there was a body of existing programs
and programmers.
You write about confusion. I think that what you present is what
grammarians would call "garden paths", that is, perceiving or
trying to make up different rules than the grammar rules. In
your 'if' example you ignore a simple thing: 'if' needs no
terminator of its own.
It is the null statement that needs a
terminating semicolon, and the "empty" compound statement that
does not need a terminator. The null statement and the compound
statement are quite different; in particular, without the
semicolon you would not know that the null statement is there,
while the compound statement can be easily recognized without
need for a terminator. There is no special rule for "if with
null statement"; if there were, you would get a needlessly complex
grammar.
In the first pair, one line is a declaration, the other is a
function definition. Again, quite different constructs,
one needing a terminator, the other not needing it.
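Gathering the cases from this subthread into one compilable sketch (not a quotation, just an illustration):

static void bar(void) {}    /* stand-in for the bar() from the example above     */

void foo(int x)
{
    while (x-- > 0);        /* the ";" is a null statement and is the loop body  */
    bar();                  /* so bar() runs once, after the loop has finished   */

    if (x) ;                /* null statement: the semicolon is required         */
    if (x) {}               /* empty compound statement: no terminator needed    */
}

int X[1] = {0};             /* a declaration needs its terminating semicolon     */
void Y(void) {}             /* a function definition does not                    */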
Garden paths are common in natural language and people
cope quite well. So for normal (even beginner)
programmers garden paths are not a real problem: you get
confused once, learn the right way and go on. In many
cases the learning is unconscious; you simply get used to
the way code is written, and when you make mistakes the
compiler tells you that there is an error, so you
correct it.
bart <bc@freeuk.com> writes:
On 14/12/2024 20:17, Waldek Hebisch wrote:[...]
Bart <bc@freeuk.com> wrote:
For the same reason that f(1 2 3) might be unambiguous, as usually
it's written f(1, 2, 3).
Actually, since C likes to declare/define lists of things where other
languages don't allow them, such as:
typedef int T, U, *V;
struct tag {int x;} a, b, c;
enum {red, green} blue, yellow;
I couldn't quite see why you can't do the same with functions:
int F(void){return 1;}, G(void){return 2;}
Those are function *definitions*. Declarations, including function declarations, can be bundled :
int F(void), G(void); // not suggesting this is good style
Definitions that are not declarations, such as function definitions,
cannot.
I thought you liked consistency.
If the language allowed function definitions to be bundled, you could
write something like :
int F(void) {
// 30 lines of code
}, G(void) {
// 50 lines of code
}
Is that what you want?
[...]
Now that you mention it, why not? In:
if (c) s1; else s2;
s1 and s2 are statements, but so is the whole if-else construct; why
doesn't that need its own terminator?
Because if it did, you'd need multiple semicolons at the end of nested statements. Is that what you want?
You seem to be pretending that there's some principle that all
statements should be terminated by semicolons. There is no such
principle, and there are multiple kinds of statements that don't
require a trailing semicolon.
This is legal in C: {}{}{}{}.
As I said, I wouldn't be able to explain it.
I could explain it to you, but I can't understand it for you.
[...]
Identifying the end of /some/ statements shouldn't mean not needing
terminators in those cases. It needs to be a consistent rule.
Why? C's grammar is unambiguous. Given reasonable code layout, most C programmers don't have any problems determining where statements end.
With your "consistent rule", if you had 5 nested statements (if, for,
while, etc.), you'd have to terminate the entire construct at least 5 semicolons.
Is that really what you want?
[...]
But the exact rules remain fuzzy.
The exact rules are not fuzzy. They're unambiguous, and they work much better than you're willing to acknowledge.
[discussion of your personal language snipped]
bart <bc@freeuk.com> writes:
On 14/12/2024 22:22, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
On 14/12/2024 20:17, Waldek Hebisch wrote:[...]
Bart <bc@freeuk.com> wrote:
For the same reason that f(1 2 3) might be unambiguous, as usually
it's written f(1, 2, 3).
Actually, since C likes to declare/define lists of things where other
languages don't allow them, such as:
typedef int T, U, *V;
struct tag {int x;} a, b, c;
enum {red, green} blue, yellow;
I couldn't quite see why you can't do the same with functions:
int F(void){return 1;}, G(void){return 2;}
Those are function *definitions*. Declarations, including function
declarations, can be bundled:
int F(void), G(void); // not suggesting this is good style
Definitions that are not declarations, such as function definitions,
cannot.
I thought you liked consistency.
Consistency would mean function definitions could be 'bundled' too.
Declarations can be bundled. Function definitions cannot.
More generally, some things can be bundled, and other things cannot.
The existing rules are consistent.
What else do you think should be able to be bundled? Macro definitions? Include directives?
#include <stdio.h>, <stddef.h>; // ???
C is not 100% consistent and orthogonal. It was never intended to be.
Everyone here knows that it isn't. If you want Lisp, whose syntax rules
are simpler than C's, you know where to find it.
Google "foolish consistency".
[...]If the language allowed function definitions to be bundled
I don't care about it.
Excellent. Then let's drop it.
On 12.12.2024 06:35, Keith Thompson wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
[...]
For (yet another) example; my K&R shows a syntax for expressions like
expression := binary
binary := binary + binary
binary := binary * binary
That's odd. Is that an exact quotation?
No, not exact, I abbreviated it; I omitted about 25 other operators and
used another syntax (no ' | '). (Trust me that it doesn't have the things
I called 'factor' and 'term' in my post, which are equivalent to what
you formulated below from your copy.) - I'm using a translation of something that someone classified as being a "pre-second" edition but
not quite the first edition. Two references point to 1977 (Prentice
Hall) and 1978 (Bell Labs). - The text for the "binary" syntax has two optional informal columns, the first one has the comment "precedence"
for some of the variants of "binary" operators. (But it's also just
titled as "Syntax in Short"; probably presented in a form to make it
easy to understand without overloading it for purpose of a textbook.)
It serves the purpose of explaining an ambiguous syntax with non-codified precedence and a separate precedence table (but it's not an exact "C"
syntax description as you'd probably find it in standards documents).
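For comparison, a minimal sketch (in the same informal notation, and not a quotation from any edition of K&R) of how the precedence can instead be encoded in the grammar itself:

expression     := additive
additive       := additive + multiplicative | multiplicative
multiplicative := multiplicative * primary | primary
primary        := identifier | constant | ( expression )

With such a grammar, 2 + 3 * 4 has only one derivation: the multiplication is forced to nest inside the right operand of the addition, so no separate precedence table is needed.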
On 12/12/24 06:38, Janis Papanagnou wrote:
[...]
My copy of K&R 1st edition has nothing remotely resembling that.
[...]
On 18.12.2024 22:04, James Kuyper wrote:
On 12/12/24 06:38, Janis Papanagnou wrote:
[...]
My copy of K&R 1st edition has nothing remotely resembling that.
I think someone here already mentioned that the English version
looks different compared to the translation I've in my bookshelf.
(So I am not sure what your post is actually intending.[*])
(But that's anyway unimportant to the point I made; that you can
define semantic information like the precedence separately or
by syntax.[**])
Janis
PS: FYI; Some of your posts, James, arrive also in my mailbox.
[*] I've uploaded (if you're interested) a scan of that page
here: http://volatile.gridbug.de/KR_syntax-rotated90.pdf
(But similar syntaxes for expressions can be found also in other
programming languages' contexts; see below.)
On 12/12/24 06:38, Janis Papanagnou wrote:
On 12.12.2024 06:35, Keith Thompson wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
[...]
For (yet another) example; my K&R shows a syntax for expressions like
expression := binary
binary := binary + binary
binary := binary * binary
That's odd. Is that an exact quotation?
No, not exact, I abbreviated it; omitted about 25 other operators and
used another syntax (no ' | '). (Trust me that it hasn't things that
I called 'factor' and 'term' in my post, which is equivalent to what
you have formulated below in your copy.) - I'm using a translation of
something that someone classified as being a "pre-second" edition but
not quite the first edition. Two references point to 1977 (Prentice
Hall) and 1978 (Bell Labs). - The text for the "binary" syntax has two
optional informal columns, the first one has the comment "precedence"
for some of the variants of "binary" operators. (But it's also just
titled as "Syntax in Short"; probably presented in a form to make it
easy to understand without overloading it for purpose of a textbook.)
It serves the purpose to explain an ambiguous syntax with non-codified
precedence and a separate precedence table (but it's not an exact "C"
syntax description as you'd probably find it in standards documents).
My copy of K&R 1st edition has nothing remotely resembling that. I
cannot find "binary" as an element of the grammar anywhere. There are 16 grammar rules for "expression". The one that comes closest is
"expression binop expression", where binop is one of C's 19 binary
operators, divided into 12 different priority levels.