On 2023-03-09 10:19, Terje Mathisen wrote:
This is much better!
Thank you!
I still think you should pay more attention to the cost of branches, i.e. a >> sequential scan of a shortish segment of the country array is very cheap, on
the order of 2-3 clock cycles/country, while a single mispredicted branch cost
8-20 cycles depending upon the CPU you are using.
An AMD FX8150 and an Intel i7-4710MQ.
Assuming this really is so important, why not "waste" a bit more lookup table
RAM for a full 2-char index?
It's Utterly Unimportant (with two huge "U"s), the program as-is runs in 0.32
seconds clock-time, but it's just interesting to see how far I can go in pure
Pascal, to translate that subsequently into in-line assembler, where I'm in many
cases harking back to Turbo Pascal 3 times, by coding post-Pentium instructions
as "db" sequences, the original code dates back to 1994, and pretty amazingly,
many of my original record structures are very AVX friendly!
Anyway, given that my ['A'..'Z'] array had one position open, I've used that to
replace the general last-matched country with a per-initial character last-matched country, and that had cut the chop-count even more:
Original full range-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a : 40,362
Low/high per initial character (pic): 14,907
As previous + first most used pic-a-a :-a 2,871
As previous + cache pic-a-a-a-a-a-a-a-a-a-a-a-a :-a 1,747
And now I'm going to think slightly outside the box, but will have to knock up a
bit of REXX to actually test this, by what might be counter-intuitive, upping
the low and/or high value of some ranges into the previous/next (or, unlikely,
but never say never, pre-previous or over-next) range, so that an initial first
lookup immediately finds the second most-used country, without generating an excessive number of additional lookups for even less used countries.
E.g. for countries starting with "L" I currently only have three, in order of
use, "LT", "LV" (74 lookups), and "L" (45 lookups), both requiring three chops
for a total of 357 chops. If I were to extend the "L" range into the "M" range
so that a first chop returns "LV" I save 148 chops, which means that even if I
need up to three more chops for "L", I still save time, and for this specific
case, extending the "Lx" range to "MEX" gives me single-chop hits on "LV" and
still the same three-chop hits on "L", so reducing the number of chops to 209.
In this specific case, I could go even further, and take the lo-range back into
the "K" countries and get to "L" in just two chops.
Doing the same for the also easy to try three additional (after "GB") "G*" countries saves me another 96 chops if I put "GR", the second most used, in the
initial middle of the range (by extending the high value to "IR"), without affecting "GE" and "GH"chop counts.
I'm going to see how this works out before doing anything else, as this change
would be easy to implement in Pascal and in my PL/I version of the same on z/OS,
I keep the two and they should give the same output, if they don't I know that
I've screwed up somewhere!
| Sysop: | Amessyroom |
|---|---|
| Location: | Fayetteville, NC |
| Users: | 65 |
| Nodes: | 6 (0 / 6) |
| Uptime: | 02:10:25 |
| Calls: | 862 |
| Files: | 1,311 |
| D/L today: |
10 files (20,373K bytes) |
| Messages: | 264,321 |