Impressive! A PCIe NVMe drive will be a boost, but don't expect
too much, when you already have so much RAM. And electric power. ;-)
My experiments with parallel threads were a bit sobering. You
really need rather isolated subprocesses that require little
synchronisation.
Otherwise the slowest process plus additional
syncing costs can eat up all the expected benefits. Nothing new.
On Sun, 18 Aug 2024 9:28:09 +0000, minforth wrote:
Impressive! A PCIe NVMe drive will be a boost, but don't expect
too much, when you already have so much RAM. And electric power. ;-)
I tried a RAM drive (from AMD), but it has a throughput of only 50 MB/s,
10x slower than the SATA 6 Gb/s connected Samsung SSD (500 MB/s). I am a
bit puzzled why that is so devastatingly slow.
My experiments with parallel threads were a bit sobering. You
really need rather isolated subprocesses that require little
synchronisation.
Yes, that is Amdahl's law. We constantly struggled with that
for tForth. Fine-grained parallelism never gave us good results.
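(To put a number on it, with made-up figures: a job that is 90%
parallelizable can at best reach 1 / (0.1 + 0.9/8) ≈ 4.7x on 8 cores,
well short of 8x, and that is before any synchronisation cost.)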
Otherwise the slowest process plus additional
syncing costs can eat up all the expected benefits. Nothing new.
A new (to me) thing was that processes slow down enormously from
accessing shared global variables (depending on their physical
location), even when no locks are needed/used. For iSPICE such
variables are in OS-managed shared memory (aka the swap file)
and are used very infrequently.
-marcel
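For readers who have not used it: pagefile-backed ("swap file") shared
memory on Windows is set up roughly as below. This is a generic sketch
with an invented object name and no error handling, not iSPICE's actual
code.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* Any process that opens a mapping with the same name sees the same
       bytes; the backing store is the system paging file. */
    HANDLE h = CreateFileMappingA(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                  0, 4096, "Local\\spice_shared_demo");
    double *vars = MapViewOfFile(h, FILE_MAP_ALL_ACCESS, 0, 0, 4096);
    vars[0] = 1.0e-9;           /* visible to every process with the mapping */
    printf("%g\n", vars[0]);
    UnmapViewOfFile(vars);
    CloseHandle(h);
    return 0;
}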
What I meant is severe slowdown when reading variables that are
physically *close* to variables that belong to another process.
It happens for both AMD and Intel on both Windows and Linux.
Spacing such variables farther apart has a dramatic impact but
is quite inconvenient in most cases.
I don't recall that transputers had these problems.
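To make the effect concrete, here is a throwaway pthreads sketch (my
illustration, not iSPICE code; it assumes 64-byte cache lines). Each
thread only ever touches its own counter, so no lock is needed, yet
when the two counters share a cache line that line ping-pongs between
the cores:

#include <pthread.h>
#include <stdalign.h>
#include <stdio.h>

#define N 100000000L                 /* increments per thread */

/* Both counters packed into one 64-byte cache line. */
static struct { alignas(64) volatile long a; volatile long b; } adjacent;

/* The same counters, each on its own cache line. */
static struct { alignas(64) volatile long a; alignas(64) volatile long b; } spaced;

static void *bump(void *p)           /* increment one counter N times */
{
    volatile long *c = p;
    for (long i = 0; i < N; i++)
        ++*c;
    return NULL;
}

static void race(volatile long *x, volatile long *y, const char *label)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump, (void *)x);
    pthread_create(&t2, NULL, bump, (void *)y);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("%-18s %ld %ld\n", label, *x, *y);
}

int main(void)                       /* time each call with `time` or clock_gettime() */
{
    race(&adjacent.a, &adjacent.b, "same cache line:");  /* typically several times slower */
    race(&spaced.a,   &spaced.b,   "separate lines:");
    return 0;
}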
In article <2df471d1ec39c22949169f8a612b780d@www.novabbs.com>,
mhx <mhx@iae.nl> wrote:
A new (to me) thing was that processes slow down enormously from
accessing shared global variables (depending on their physical
location), even when no locks are needed/used. For iSPICE such
variables are in OS-managed shared memory (aka the swap file)
and are used very infrequently.
That agrees with my experience. Parallel processes work with the
same image. The protocol is that one process writes to a shared
variable and the other reads it. The last process signals the chain
that it is ready. All processes busy-wait on that signal to stop
and pass it down the chain.
That was on Linux with AMD.
Was your experience on MS Windows with Intel?
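For what it is worth, a bare-bones version of that hand-off with C11
atomics, using threads instead of full processes; the names and worker
count are invented, so only the shape of the protocol matches:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NWORKERS 4

/* One flag per stage; stage i busy-waits on its own flag, does its share
   of the work, then passes the signal on to the next stage. */
static atomic_int ready[NWORKERS + 1];

static void *worker(void *arg)
{
    int i = (int)(long)arg;
    while (!atomic_load_explicit(&ready[i], memory_order_acquire))
        ;                            /* busy wait, no lock */
    /* ... this stage's share of the work ... */
    atomic_store_explicit(&ready[i + 1], 1, memory_order_release);
    return NULL;
}

int main(void)
{
    pthread_t t[NWORKERS];
    for (long i = 0; i < NWORKERS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    atomic_store_explicit(&ready[0], 1, memory_order_release);  /* start the chain */
    while (!atomic_load_explicit(&ready[NWORKERS], memory_order_acquire))
        ;                            /* the last stage signals completion */
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(t[i], NULL);
    puts("chain done");
    return 0;
}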
mhx@iae.nl (mhx) writes:
What I meant is severe slowdown when reading variables that are
physically *close* to variables that belong to another process.
Yes, but if you want performance, you have to rearrange your data to
avoid false sharing.
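In C11 the usual rearrangement is simply to give each thread its own
cache line, e.g. as below (64-byte lines assumed; the names and worker
count are made up):

#include <stdalign.h>
#include <stdatomic.h>
#include <stdio.h>

#define NWORKERS 8                   /* invented worker count */

/* One slot per worker; alignas(64) pads each slot out to a full cache
   line, so a write by one worker never invalidates the line another
   worker is reading. */
typedef struct {
    alignas(64) atomic_long counter;
} padded_slot;

static padded_slot slot[NWORKERS];

int main(void)
{
    printf("%zu bytes per slot\n", sizeof slot[0]);   /* 64, not 8 */
    return 0;
}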
Do you know if shared memory as provided by the OS (or Windows)
has these problems too?
Impressive! A PCIe NVMe drive will be a boost, but don't expect
too much, when you already have so much RAM. And electric power. ;-)
I didn't catch your drift there until I found out why there are no
really fast RAM drives. The fastest drive is no drive at all, and
that is possible by writing the simulation data to a temp file.
Windows has a special attribute for that (_O_SHORT_LIVED) and
Linux has shm.
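Roughly what that looks like on either side; error handling is minimal
and the file/object names are invented:

#include <stdio.h>
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#include <sys/stat.h>
#else
#include <sys/mman.h>
#include <fcntl.h>
#endif

int main(void)
{
#ifdef _WIN32
    /* _O_SHORT_LIVED hints that the data should stay in the file cache if
       at all possible; _O_TEMPORARY deletes the file when it is closed. */
    int fd = _open("sim_scratch.tmp",
                   _O_CREAT | _O_RDWR | _O_BINARY | _O_SHORT_LIVED | _O_TEMPORARY,
                   _S_IREAD | _S_IWRITE);
#else
    /* A POSIX shared-memory object lives in tmpfs, i.e. in RAM, with no
       disk behind it (link with -lrt on older glibc). */
    int fd = shm_open("/sim_scratch", O_CREAT | O_RDWR, 0600);
    shm_unlink("/sim_scratch");      /* anonymous: gone when the fd closes */
#endif
    if (fd < 0) { perror("scratch"); return 1; }
    /* ... write the simulation data through fd, or mmap it ... */
    puts("scratch storage ready");
    return 0;
}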