For work here, all functions that return data have a parameter that
limits the byte count. Usually the caller passes a pointer to a buffer,
with the size parameter equal to or less than the buffer size. Some C
library functions are just not safe to use in that respect.
Locally written string copy and other functions use the same sort of
size limiting.
And OpenVMS and apps have not-particularly-reentrant and
not-particularly-performant code that first calls some system service to
size the data, then allocates the buffer, and then calls again to
return the data.
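As a rough illustration of the size-limiting convention described above (not the poster's actual code), here is a minimal sketch. Python is used only for brevity; the name `bounded_copy` and the choice to reserve one byte for a NUL terminator are assumptions made for the example.

```python
def bounded_copy(dst: bytearray, src: bytes) -> int:
    """Copy src into dst without ever writing past len(dst).

    Hypothetical helper: the destination size is the hard limit, one byte
    is reserved for a NUL terminator, and the number of payload bytes
    actually copied is returned so the caller can detect truncation.
    """
    if len(dst) == 0:
        return 0                      # nothing can be written at all
    n = min(len(src), len(dst) - 1)   # leave room for the terminator
    dst[:n] = src[:n]                 # same-length slice: dst never grows
    dst[n] = 0                        # always terminate
    return n


if __name__ == "__main__":
    buf = bytearray(8)
    copied = bounded_copy(buf, b"hello world")
    print(copied, bytes(buf[:copied]))
```

The caller compares the return value with `len(src)` to see whether the data was truncated, instead of trusting the source to fit.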
OpenVMS has ~no concept of languages, either. Yeah, the C and C++ I18N
giblets, Java and its own little world, maybe using the existing and
older ICU or maybe you ported a newer ICU, and the deprecated Terminal
Fallback Facility (TFF) and National (Replacement) Character Set
(NCS) giblets, sure. All of which make things more interesting for
apps that want or need to deal with the UTF-8 and post-ASCII world.
On Fri, 23 Aug 2024 19:34:18 -0400, Arne Vajhøj wrote:
UTF-8 in file names, in usernames, in logicals, in identifiers and in
programs/scripts: not really needed.
I would say these are needed.
On 8/23/2024 11:32 AM, Stephen Hoffman wrote:
OpenVMS has ~no concept of languages, either. Yeah, the C and C++ I18N
giblets, Java and its own little world, maybe using the existing and
older ICU or maybe you ported a newer ICU, and the deprecated Terminal
Fallback Facility (TFF) and National (Replacement) Character Set
(NCS) giblets, sure. All of which make things more interesting for
apps that want or need to deal with the UTF-8 and post-ASCII world.
Regarding UTF-8 support, my take is that:
UTF-8 in file names, in usernames, in logicals, in identifiers and in programs/scripts: not really needed.
UTF-8 in file content and in databases: very much needed.
And support for the latter falls into 3 groups:
* JVM languages (Java, Groovy etc.) and I believe Python - do
support Unicode and can read/write using any encoding including UTF-8
* C, C++, PHP - the developer keeps track of what encoding a byte
sequence is in, but it is possible to explicitly convert encodings
(C/C++ has wchar_t but it is neither much used nor UTF-8 friendly
AFAIK)
* the traditional native languages - very little support except what
can be done by calling C functions
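A minimal sketch of what the first group looks like in practice: text is held as Unicode internally, and bytes are decoded or encoded at the boundary with an explicitly named encoding. The helper name `latin1_to_utf8` and the sample string are illustrative assumptions, not something from the posts.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (hypothetical helper)."""
    text = raw.decode("iso-8859-1")   # bytes -> Unicode string
    return text.encode("utf-8")       # Unicode string -> UTF-8 bytes


sample = "Vajhøj"                      # illustrative sample text
latin1 = sample.encode("iso-8859-1")   # one byte per character
utf8 = latin1_to_utf8(latin1)          # "ø" needs two bytes in UTF-8

# Decoding with the wrong encoding raises no error here; it silently
# yields mojibake, which is why the encoding has to be tracked.
wrong = utf8.decode("iso-8859-1")

if __name__ == "__main__":
    print(len(latin1), len(utf8), wrong)
```

In the second group (C, C++, PHP) the same two steps exist, but nothing in the type system says which encoding a given byte sequence is in, so that bookkeeping falls to the developer.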
On 8/23/2024 7:41 PM, Lawrence D'Oliveiro wrote:
On Fri, 23 Aug 2024 19:34:18 -0400, Arne Vajhøj wrote:
UTF-8 in file names, in usernames, in logicals, in identifiers and in
programs/scripts: not really needed.
I would say these are needed.
I can live without the ability to create:
blåbærsyltetøj.txt
:-)
Arne
On 24/08/2024 00:59, Arne Vajhøj wrote:
On 8/23/2024 7:41 PM, Lawrence D'Oliveiro wrote:
On Fri, 23 Aug 2024 19:34:18 -0400, Arne Vajhøj wrote:
UTF-8 in file names, in usernames, in logicals, in identifiers and in
programs/scripts: not really needed.
I would say these are needed.
I can live without the ability to create:
blåbærsyltetøj.txt
:-)
Is that Norwegian, or Swedish?
:-)
On Fri, 23 Aug 2024 20:10:44 -0500, Craig A. Berry wrote:
I'm told Unicode support in Python 2.x was pretty shaky but Python 3 is
a lot better.
That was the primary motivation behind the Python 2→3 transition.
Unicode was also the primary motivation behind PHP 5->6. But it got
cancelled and PHP did 5->7 without going Unicode.
On Fri, 23 Aug 2024 22:23:18 -0400, Arne Vajhøj wrote:
Unicode was also the primary motivation behind PHP 5->6. But it got cancelled and PHP did 5->7 without going Unicode.
To be fair, that little peccadillo does get lost in the avalanche of
other reasons why PHP is bloody awful ...