> For example, would a rollout with 5184 trials
> at xg-roller level be as reliable as a rollout
> with 2592 trials at xg-roller+ level and as a
> rollout with 1296 trials at xg-roller++ level?

There's a subtle distinction between "precision" and "accuracy."
...
Accuracy is another matter. Murat of all people should understand
that "what the bot thinks the correct play is" is not necessarily
the same as "the correct play"; indeed, in some positions, it is
debatable what "the correct play" is since that can depend on who
your opponent is, what their emotional state is at the time, etc.
But even setting those things aside, suppose for the sake of
argument that we define "the correct play" as what game theorists
would call an (expectiminimax) "equilibrium" play. We can ask whether
stronger settings are more likely to yield the correct play. The
answer is that we can't ever be completely sure, but one can give
heuristic arguments in support of this principle. For example,
equilibrium play has a certain self-consistency property, so you
can "cross-examine" the bot and see its answers are self-consistent.
Experience suggests that stronger settings exhibit greater
self-consistency. Bob Wachtel's book "In the Game Until the End"
has some examples of this. But again, the arguments are only
heuristic, and we certainly can't be completely sure in any
particular instance that stronger settings are giving us more
"accurate" answers.
> Related:
> | Man beats machine at Go in human victory over AI
> | ... <https://arstechnica.com/information-technology/2023/02/man-beats-machine-at-go-in-human-victory-over-ai/>

The title "Man beats machine at Go in human
On December 13, 2023 at 7:27:56 AM UTC-7, Timothy Chow wrote:
> On 12/13/2023 4:47 AM, MK wrote:
>> For example, would a rollout with 5184 trials
>> at xg-roller level be as reliable as a rollout
>> with 2592 trials at xg-roller+ level and as a
>> rollout with 1296 trials at xg-roller++ level?
>
> There's a subtle distinction between "precision"
> and "accuracy."

The distinction is more than subtle, especially

> An "accurate" verdict is one that gives the
> correct answer.

That's the loose definition. The strict definition

> A "precise" estimate has very little statistical
> noise. Increasing the number of trials increases
> the precision.

Yes, more trials reduce random errors ("noise")

> If you have a lot of trials then you can be very
> confident that you are learning "what the bot
> really thinks" and that it is very unlikely to
> change its mind even if you increase the number
> of trials to infinity.

This isn't necessarily true and indeed incomplete.
While random errors decrease, systematic errors
may increase (accumulate and compound), thus
cause the bot to change its mind.

> Accuracy is another matter. Murat of all people
> should understand that "what the bot thinks the
> correct play is" is not necessarily the same as
> "the correct play"; indeed, in some positions, it is
> debatable what "the correct play" is since that
> can depend on who your opponent is, what their
> emotional state is at the time, etc.

It's good that you acknowledge/agree on these

> But even setting those things aside,

Yes, let's focus on the more tangible...

> suppose for the sake of argument that we define
> "the correct play" as what game theorists would
> call an (expectiminimax) "equilibrium" play.

I can only accept "correct play" based on empirical

> We can ask whether stronger settings are more
> likely to yield the correct play.

I assume you mean look-ahead plies? Can you (or
someone else) expand on this and explain/clarify
how plies work during play and during rollouts?

> The answer is that we can't ever be completely
> sure, but one can give heuristic arguments in
> support of this principle. For example, equilibrium
> play has a certain self-consistency property,

I won't argue against self-consistency if you can
prove that your equilibrium play is actually that.

> so you can "cross-examine" the bot and see its
> answers are self-consistent.

This would be most interesting for me to see. Has
any bot been cross-examined for this and how?

> Experience suggests that stronger settings exhibit
> greater self-consistency. Bob Wachtel's book "In
> the Game Until the End" has some examples of this.

Can you give some examples here from the book

> But again, the arguments are only heuristic, and
> we certainly can't be completely sure in any
> particular instance that stronger settings are
> giving us more "accurate" answers.

I argue that we can if we have unbiased bots that
are trained not only through cubeless, single-game
play but also through cubeful and "matchful" play,
eliminating extrapolated cubeful/matchful equities.

Still, it's quite interesting but unfortunately this
won't happen in gamblegammon anytime soon
because there is no dissenting bot and I am the
only dissenting human, who can't even lead the
horses to water, let alone make them drink... :(

MK
> Still, it's quite interesting but unfortunately this
> won't happen in gamblegammon anytime soon
> because there is no dissenting bot and I am the
> only dissenting human, who can't even lead the
> horses to water, let alone make them drink... :(

Just as I was about to follow up to my own post
> On 12/22/2023 12:18 PM, MK wrote:
>> While random errors decrease, systematic errors
>> may increase (accumulate and compound), thus
>> cause the bot to change its mind.
>
> No, this is not correct, at least when you are simply
> extending a specific rollout.

You would be right if the number of trials is infinite.

> Systematic errors can indeed accumulate and
> compound over the course of a game,

Since you also reused "compound", now I am curious

> but a rollout trial repeatedly samples an entire game,
> so *each individual* trial is subject to the accumulated
> systematic error.

Okay, I agree.

> There will be some randomness involved from trial
> to trial, of course; some trials may be "lucky" enough
> to avoid the variations that suffer from a lot of
> accumulated systematic error, while other trials may
> be "unlucky" enough to hit those variations,

I don't like the use of the word "luck" in this context

> but in the long run these fluctuations will even out,
> and the rollout will converge. The final result will be
> an average over all accumulated systematic errors.

A rollout is a continuum. When you stop it after any given
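
The point about random versus systematic error can be illustrated numerically. The following is a minimal sketch, not anything taken from XG or GNU Backgammon: it assumes a made-up true equity, a fixed `bias` standing in for the bot's accumulated systematic error, and Gaussian noise standing in for dice luck. All names and numbers are hypothetical.

```python
import math
import random

def rollout(n_trials, true_equity=0.10, bias=0.03, noise_sd=1.0, seed=0):
    """Simulate a rollout in which every trial carries the same systematic bias.

    All parameters are hypothetical: `bias` stands in for the bot's
    accumulated evaluation error and `noise_sd` for dice luck.
    Returns (mean result, standard error of the mean).
    """
    rng = random.Random(seed)
    results = [true_equity + bias + rng.gauss(0.0, noise_sd)
               for _ in range(n_trials)]
    mean = sum(results) / n_trials
    variance = sum((r - mean) ** 2 for r in results) / (n_trials - 1)
    return mean, math.sqrt(variance / n_trials)

for n in (1296, 2592, 5184):
    mean, se = rollout(n)
    print(f"{n:5d} trials: mean={mean:+.4f}  standard error={se:.4f}")
```

Under these assumptions the standard error shrinks roughly as 1/sqrt(n), so quadrupling the trials about halves it, while the mean settles near `true_equity + bias` rather than `true_equity`. In this toy model more trials buy precision, and the bias is untouched.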
The fact that the main protagonist here has a ridiculous interest in hawking fast cars around various US "strips" is irrelevant.
> On 12/22/2023 12:18 PM, MK wrote:
>> I assume you mean look-ahead plies? Can you (or
>> someone else) expand on this and explain/clarify
>> how plies work during play and during rollouts?
>
> The GNU team can answer this better than I can.

Let's see if they do. If not, we may ask this on their

> One thing to note is that during rollouts, the bots
> will apply some kind of move filter to screen out
> unpromising plays. That is, if you perform a 3-ply
> rollout, the bot doesn't necessarily evaluate every
> legal move at 3-ply and pick the highest-scoring
> one. It will evaluate all the options at the lowest
> ply but then discard a lot of them as not likely to
> emerge as the top play.

I kind of knew this but didn't ponder much on it.
> On 12/22/2023 12:18 PM, MK wrote:
>> I won't argue against self-consistency if you can
>> prove that your equilibrium play is actually that.
>
> The *theoretical* equilibrium play is *defined* in
> terms of a system of equations that expresses
> self-consistency. If you insist on an empirical
> definition, though, then self-consistency can't be
> proved.

This sounds good to me. Let's archive it... ;)
>>> so you can "cross-examine" the bot and see its
>>> answers are self-consistent.
>>
>> This would be most interesting for me to see. Has
>> any bot been cross-examined for this and how?
>
> I don't know if anyone has done this in a systematic
> fashion, but certainly, if you take some crazy
> superbackgame or containment position, you can
> observe inconsistency yourself. Note down the
> 3-ply equity (for example). Then run through all the
> possible rolls, and note down their 3-ply equities.
> Average them, and you'll find that they don't average
> out to the original 3-ply equity. This means that the
> 3-ply equity isn't (entirely) self-consistent. In many
> positions, the top play will still be the top play, but
> in the crazy superbackgame positions, this experiment
> can result in wild swings that drastically change the
> top play.

Even though I wouldn't limit it to "crazy superbackgame"
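
The experiment described above (note the equity, note the equity after each possible roll, average, compare) might be sketched as follows. This is only an illustration under assumptions: `consistency_gap` is a hypothetical helper, the child equities are taken to be expressed from the same player's point of view as the parent equity, and the numbers are toy values rather than real bot output.

```python
from itertools import combinations_with_replacement

def roll_weights():
    """Probability of each of the 21 distinct rolls (doubles 1/36, others 2/36)."""
    return {
        (a, b): (1 if a == b else 2) / 36
        for a, b in combinations_with_replacement(range(1, 7), 2)
    }

def consistency_gap(parent_equity, child_equity):
    """Parent equity minus the roll-weighted average of the child equities.

    `child_equity` maps each roll (a, b) with a <= b to the equity reported
    after that roll is played, from the same player's point of view as
    `parent_equity` (an assumption of this sketch).
    """
    weights = roll_weights()
    average = sum(weights[roll] * child_equity[roll] for roll in weights)
    return parent_equity - average

# Toy numbers, not real bot output: every roll evaluates to 0.20
# except 6-6, which the follow-up evaluation scores much higher.
children = {roll: 0.20 for roll in roll_weights()}
children[(6, 6)] = 0.92
gap = consistency_gap(0.20, children)
print(f"gap = {gap:+.4f}")
```

A gap of zero would mean the evaluation is self-consistent at this position in the sense described; a nonzero gap is the kind of discrepancy the experiment is meant to expose.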
> On 12/22/2023 12:18 PM, MK wrote:
>> I argue that we can if we have unbiased bots that
>> are trained not only through cubeless, single-game
>> play but also through cubeful and "matchful" play,
>> eliminating extrapolated cubeful/matchful equities.
>
> There are certainly ways to improve the way bots
> are trained, but it will still be true that we won't be
> *completely* sure that we're getting more accurate
> answers in every position. That would require more
> computing power than is available in the observable
> universe.

a- We wouldn't need to train for every possible position.
Now that I do, my immediate reaction is that it
sounds really bad. Shouldn't it be the other way
around? That is, evaluate at a higher ply first?
> On 12/22/2023 12:18 PM, MK wrote:
>> I assume you mean look-ahead plies? Can you (or
>> someone else) expand on this and explain/clarify
>> how plies work during play and during rollouts?
>
> The GNU team can answer this better than I can. One thing to note
> is that during rollouts, the bots will apply some kind of move
> filter to screen out unpromising plays. That is, if you perform
> a 3-ply rollout, the bot doesn't necessarily evaluate every legal
> move at 3-ply and pick the highest-scoring one. It will evaluate
> all the options at the lowest ply but then discard a lot of them
> as not likely to emerge as the top play.
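
The move filter described here could look roughly like the sketch below. It is not the filter any particular bot uses: `shallow_eval` and `deep_eval` are caller-supplied stand-ins for low-ply and high-ply evaluation, and `keep` and `tolerance` are hypothetical settings chosen for illustration.

```python
def filtered_best_move(moves, shallow_eval, deep_eval, keep=4, tolerance=0.10):
    """Pick a move by screening at low ply, then re-checking the survivors.

    `shallow_eval` and `deep_eval` map a move to an equity; `keep` and
    `tolerance` are hypothetical filter settings, not the values any
    particular bot uses.
    """
    ranked = sorted(moves, key=shallow_eval, reverse=True)
    best_shallow = shallow_eval(ranked[0])
    survivors = [
        m for m in ranked[:keep]
        if best_shallow - shallow_eval(m) <= tolerance
    ]
    # Only the survivors pay the cost of the deeper evaluation.
    return max(survivors, key=deep_eval), survivors

# Toy equities for four candidate plays (made-up numbers).
shallow = {"A": 0.30, "B": 0.27, "C": 0.05, "D": -0.20}
deep = {"A": 0.25, "B": 0.29, "C": 0.40, "D": -0.10}
choice, survivors = filtered_best_move(list(shallow), shallow.get, deep.get)
print(choice, survivors)
```

In this toy data play C would score highest at the deeper level but never gets there, because the shallow pass discards it. That is the trade-off the filter makes: it saves the expensive evaluations at the risk of occasionally screening out the play that deeper analysis would prefer.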
On December 27, 2023 at 5:22:06 AM UTC-7, Timothy Chow wrote:
> On 12/27/2023 2:16 AM, MK wrote:
>> Now that I do, my immediate reaction is that it
>> sounds really bad. Shouldn't it be the other way
>> around? That is, evaluate at a higher ply first?
>
> It's done for speed. Each additional ply slows
> things down by a factor of (about) 21.

Ah, that magic number 21 again. :) The number
of possible dice rolls at every turn... ;)
But why the factor is imprecise, i.e. "about 21"?
Can't you give us the exact math...?
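
For reference, the dice arithmetic behind the number 21 can be checked in a few lines. This sketch only counts rolls; the growth figures assume, purely for illustration, that each extra ply multiplies the work by exactly 21, which is the idealization the thread is questioning.

```python
from itertools import product

ordered = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes
distinct = {tuple(sorted(r)) for r in ordered}   # 21 distinguishable rolls

print(len(ordered), len(distinct))

# Idealized growth if every extra ply multiplied the work by exactly 21
# (ignores move generation, filtering, caching and everything else).
for extra_plies in range(1, 4):
    print(extra_plies, 21 ** extra_plies)
```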
On January 8, 2024 at 6:55:38 AM UTC-7, Timothy Chow wrote:
> On 1/6/2024 7:50 PM, MK wrote:
>> On December 27, 2023 at 5:22:06 AM UTC-7, Timothy Chow wrote:
>>> It's done for speed. Each additional ply slows
>>> things down by a factor of (about) 21.
>>
>> But why the factor is imprecise, i.e. "about 21"?
>> Can't you give us the exact math...?
>
> The speed at which a complex piece of code

You mean like this one?:
=======================================
GNU Backgammon Manual V1.00.0
10.4.5.4 n-ply Cubeful equities
..... so how so GNU Backgammon calculate cubeful
2-ply equities? The answer is: by simple recursion:
Equity=0
Loop over 21 dice rolls
Find best move for given roll
Equity = Equity + Evaluate n-1 ply equity for resulting position
End Loop
Equity = Equity/36
=======================================

> runs depends on many factors beyond the simple
> math of how many different rolls there are.

I have a feeling that it has something to do with the
> On 1/8/2024 1:14 PM, MK wrote:
>> On January 8, 2024 at 6:55:38 AM UTC-7, Timothy Chow wrote:
>>> The speed at which a complex piece of code
>>
>> You mean like this one?:
>> =======================================
>> GNU Backgammon Manual V1.00.0
>> 10.4.5.4 n-ply Cubeful equities
>> ..... so how so GNU Backgammon calculate cubeful
>> 2-ply equities? The answer is: by simple recursion:
>> Equity=0
>> Loop over 21 dice rolls
>> Find best move for given roll
>> Equity = Equity + Evaluate n-1 ply equity for resulting position
>> End Loop
>> Equity = Equity/36
>> =======================================
>
> That's pseudocode, not code.

So? Why isn't it good enough for you??
So, you should be able to explain the reason based
on the above pseudo code.
> On 1/9/2024 1:28 AM, MK wrote:
>> So, you should be able to explain the reason
>> based on the above pseudo code.
>
> Finding the best move for a given roll isn't
> necessarily going to take the same amount
> of time for every roll. To find the best move,
> one must first generate all the legal moves
> and evaluate them. The number of legal ways
> to play 11 is not necessarily going to be
> the same as the number of legal ways to
> play 66. It will depend on the position.

This is not it. Just like dice rolls even out (or can
be forced to artificially even out faster), number
of legal ways to play for given dice rolls at given
positions will also average out.
On 1/11/2024 4:17 AM, MK wrote:
> This is not it. Just like dice rolls even out
> (or can be forced to artificially even out
> faster), number of legal ways to play for given
> dice rolls at given positions will average out.

Of course. That's what "approximately" means.
Check your dictionary.
=======================================
GNU Backgammon Manual V1.00.0
10.4.5.4 n-ply Cubeful equities
..... so how so GNU Backgammon calculate cubeful
2-ply equities? The answer is: by simple recursion:
Equity=0
Loop over 21 dice rolls
Find best move for given roll
Equity = Equity + Evaluate n-1 ply equity for resulting position
End Loop
Equity = Equity/36
=======================================
Oh, I almost forgot. There is a kind of rotten easter
egg in the above pseudocode. Let's see how long it
will take for you whizzes to find it...? :)
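
For readers who want to experiment with the quoted recursion, here is one way it might be written out. It is a sketch under assumptions, not GNU Backgammon's code: `best_move` and `static_equity` are hypothetical callbacks, the switch of perspective between the two players is left out, and the roll weighting is made explicit (each double counted once, each non-double twice) so that the 21 terms add up to the 36 used in the final division. Whether that weighting is the oddity MK has in mind is not stated in the thread.

```python
from itertools import combinations_with_replacement

def n_ply_equity(position, n, best_move, static_equity):
    """Average the (n-1)-ply equity over all rolls, weighted by probability.

    `best_move(position, roll)` returns the resulting position and
    `static_equity(position)` the 0-ply estimate; both are hypothetical
    callbacks supplied by the caller.
    """
    if n == 0:
        return static_equity(position)
    total = 0.0
    for a, b in combinations_with_replacement(range(1, 7), 2):
        weight = 1 if a == b else 2          # 6 doubles + 15 * 2 = 36
        child = best_move(position, (a, b))
        total += weight * n_ply_equity(child, n - 1, best_move, static_equity)
    return total / 36

# Toy game: a position is just a number and a roll adds its pip count,
# so the 1-ply "equity" of position 0 is the average pip count of a roll.
equity = n_ply_equity(0, 1, lambda pos, roll: pos + sum(roll), lambda pos: pos)
print(equity)
```

The toy game is only there to make the function runnable; with real callbacks the cost of each call to `best_move` would vary with the roll and the position, which is the point made earlier about the factor being only "about" 21.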