All times are UTC - 8 hours [ DST ]




 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #61 Posted: Wed Apr 22, 2020 9:35 pm 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
Hm, maybe it's not just about rounding after all, but the prob mass of draws (achievable with perfect komi) also matters? So the hypothesis about class bounds is true for integer komi, but for half point komi it depends on where the initial subpoint balance (error margins before next worse rounding) is?

The particular worst case seems to be when the initial fractionals extremely favor B, so it is very hard to draw with W (0.001-0.999 balance). So basically only perfect play can draw with W vs perfect play. This doesn't break the Elo bound with perfect integer komi because now B can draw easily, even if relatively weaker.

But if you add half point komi in this scenario (or the rule that W wins ties), making the game a theoretical W win, this only affects perfect play (vs perfect play), and pushes him several subpoint classes ahead [-1], for example. This is because the other 0.999 balance mass is now lost, torn out of the prob space. It doesn't help weaker players, and has no effect, as drawing is useless with B.

Or am I hallucinating? Are those subpoint classes real or imaginary? They perform increasingly better against perfect play, catching more and more of those (now) very hard W draws. But maybe this only manifests vs perfect play (local chain again)? Can they also perform increasingly more classes better against weaker opponents as well?

Edit: Maybe what happens here is the hard-to-achieve W draw forms a new very small "smallest scoring unit" (without the B 0.999 part of a point)?


Last edited by moha on Thu Apr 23, 2020 1:05 pm, edited 1 time in total.
 
 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #62 Posted: Thu Apr 23, 2020 2:30 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
lightvector wrote:
moha wrote:
lightvector wrote:
The game is exactly the same except now the grid is on all the integers (...-2,-1,0,1,2,...) and the game starts at 0.499999 (so with perfect play with both players always flipping zeros on cards, the game is a draw).
Thanks! I'm not sure how you meant your last comment? It seems variants A and B collapse immediately as it is now possible to beat perfect play, even for nonperfect players (and the game is NOT always draw even between perfect players). Variant C remains.

Sure, just focus on C then, ignore A and B.

moha wrote:
But can they demonstrate their class advantage/differences over each other against the -1 pt player?


Oh, I think I see what you're getting at, and why you've been insistent on adding draws. Yes, you're right about this objection. I hadn't considered enough the difference between draws and drawless games, thanks for pushing on this detail. :tmbup:


I remember being surprised in my undergraduate seminar on psychological research that, when comparing two treatments or conditions, you ignore results that show no difference. I was accustomed to thinking of rewarding a draw as ½ pt. On the basis of that reasoning, which I think is sound, rating systems should ignore draws. For chilled go a 0 score is a win for the player who got the last play.

But back to my main point. Really, draws should not count for Elo ratings.

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #63 Posted: Thu Apr 23, 2020 2:46 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
moha wrote:
lightvector wrote:
maybe by some chance Go with half-integer komi could have some partial element of this - or not, it's hard to tell
But there IS actual, meaningful rounding in go. As I wrote earlier, there is no real half point komi in integer games. Chinese with 7.5 komi is actually komi 7 with W winning ties. What happens here is that we play the game, THEN the score gets rounded (with ties retaining their prob mass), THEN we add the komi (which is integer), THEN we decide to treat final draws as W wins. Order matters.
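
A minimal sketch of the ordering described here, assuming the board result is already an integer margin (Black minus White) and using hypothetical function names. It only checks that "komi 7.5" and "komi 7 with W winning ties" pick the same winner for every integer margin; it does not model how fractional errors get rounded on the board.

```python
def winner_half_point_komi(board_margin: int, komi: float = 7.5) -> str:
    """Decide the game directly with a half-point komi (no ties possible)."""
    return "B" if board_margin - komi > 0 else "W"


def winner_integer_komi_white_ties(board_margin: int, komi: int = 7) -> str:
    """Same decision expressed as integer komi plus the rule 'W wins ties'."""
    result = board_margin - komi  # integer result; 0 would be a draw
    if result > 0:
        return "B"
    return "W"  # covers real White wins and the would-be draws


# The two formulations agree for every integer board margin in this range.
for margin in range(-20, 21):
    assert winner_half_point_komi(margin) == winner_integer_komi_white_ties(margin)
```

The point of writing it as two steps is that the draw (result 0) exists as a distinct outcome before the tie rule reassigns it to White.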


On the question of half point komi, let me take the opportunity to plug button go, where a player can take the button, worth ½ pt. by area scoring. The main effect of the button is that it does not matter who gets the last dame. :)

As for rounding, I am not sure what moha means. The only rounding I am aware of in go is what David Wolfe explained to me long ago, that a fractional score in chilled go gets rounded up or down to a territory integer score, depending on who has the move. (Ignoring ko complications, OC. ;))

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #64 Posted: Thu Apr 23, 2020 3:11 am 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
Rounding came up in different contexts, the general sense that since board results are integer, subpoint mistakes will disappear - one way or another. So the resolution of performance measurement in a game is a whole point ("smallest scoring unit").

But if draws are not treated as draws, there may be two new smaller granulated performance units (draw with W and draw with B - these are, at least vs perfect play, narrower than a point), and these may affect class bounds (which are related to the smallest unit).

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #65 Posted: Thu Apr 23, 2020 4:20 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
moha wrote:
Rounding came up in different contexts, the general sense that since board results are integer, subpoint mistakes will disappear - one way or another. So the resolution of performance measurement in a game is a whole point ("smallest scoring unit").


OK, thanks. :)

BTW, that's one reason that I like button go, because, like chilling, it takes account of such tiny errors. And why I suggested chilled go for nearly perfect play. It seems to me that, since correct play in chilled go is also correct play in territory go and also in area go, a perfect player should be able to play a perfect game of chilled go. :) OC, in the rules you have to account for possible ko complications.

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #66 Posted: Thu Apr 23, 2020 10:08 am 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
Bill Spight wrote:
since correct play in chilled go is also correct play in territory go and also in area go
What did I misunderstand then? I mean here:
moha wrote:
If I understood Bill correctly a chilled score of 6.8 could be seen as better than 6.7 (and it actually is if we stop chilled), but two chilled scores of 6.6 are the same. But since the rounding direction will matter for territory (an insanely lot at these levels), chilled 6.6 with W to move is different to chilled 6.6 with B to move - exactly what CGT wanted to avoid

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #67 Posted: Thu Apr 23, 2020 10:51 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
moha wrote:
Bill Spight wrote:
since correct play in chilled go is also correct play in territory go and also in area go
What did I misunderstand then? I mean here:
moha wrote:
If I understood Bill correctly a chilled score of 6.8 could be seen as better than 6.7 (and it actually is if we stop chilled), but two chilled scores of 6.6 are the same. But since the rounding direction will matter for territory (an insanely lot at these levels), chilled 6.6 with W to move is different to chilled 6.6 with B to move - exactly what CGT wanted to avoid


I don't know if you misunderstood anything, except that I was not intending to distinguish between identical chilled scores depending upon who had the move, except for who gets the last play in case of a 0 result after adjusting for komi. Or not, if you want to have ties. :)

Anyway, if the chilled komi in the 19x19 is 7, and the board score is 7, that means that the territory board score is also 7, which normally means that there are an even number of dame, and so White got the last play and White wins. It is theoretically possible that a perfect player might be able to engineer a seki such that Black would win, as lightvector was speculating at one point, I think.

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #68 Posted: Thu Apr 23, 2020 11:16 am 
Dies with sente

Posts: 79
Liked others: 4
Was liked: 28
Rank: 2 kyu
GD Posts: 109
Universal go server handle: EricBackus
Bill Spight wrote:
On the basis of that reasoning, which I think is sound, rating systems should ignore draws. ... Really, draws should not count for Elo ratings.

I'm getting off track from the main discussion, but I don't understand these remarks.

If all you want to know is which of two players is better than the other, clearly draws can be ignored. But if you want some understanding of how far apart in ability two players are, it seems like draws provide some information that would be better used than ignored.

For example, if two players play 100 games, and get a draw on 99 of them, the winner of the non-drawn game is more likely the stronger player. But the 99 draws give some indication that these players are relatively close in ability. Compare with two players playing 100 games and one player wins 99 of them, which gives some indication that the winning player is relatively much stronger than the other.


This post by EricBackus was liked by 2 people: gennan, Waylon
 
 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #69 Posted: Thu Apr 23, 2020 12:34 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
EricBackus wrote:
Bill Spight wrote:
On the basis of that reasoning, which I think is sound, rating systems should ignore draws. ... Really, draws should not count for Elo ratings.

I'm getting off track from the main discussion, but I don't understand these remarks.

If all you want to know is which of two players is better than the other, clearly draws can be ignored. But if you want some understanding of how far apart in ability two players are, it seems like draws provide some information that would be better used than ignored.

For example, if two players play 100 games, and get a draw on 99 of them, the winner of the non-drawn game is more likely the stronger player. But the 99 draws give some indication that these players are relatively close in ability. Compare with two players playing 100 games and one player wins 99 of them, which gives some indication that the winning player is relatively much stronger than the other.


I think that the proper comparison is between equivalent results. Suppose that two players play 100 games, presumably 50 with each player going first, and they draw 99 games and player A wins 1; or the 100 game match ends with one draw, while player A wins 50 games and player B wins 49 games. Does one draw mean that they are more closely matched than 99 draws? Or perhaps it is the other way around?
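
To put numbers on the match records discussed here, a rough sketch using the standard logistic Elo relation between score fraction and rating gap. The function name and the `count_draws` switch are illustrative assumptions, not part of any rating system's actual code.

```python
import math


def elo_difference(wins: int, draws: int, losses: int, count_draws: bool = True) -> float:
    """Estimated Elo gap for player A from a match record.

    With count_draws=True a draw scores half a point; with False, draws are
    dropped from the record before the score fraction is computed.
    """
    if count_draws:
        score = wins + 0.5 * draws
        games = wins + draws + losses
    else:
        score = float(wins)
        games = wins + losses
    if games == 0:
        raise ValueError("no games left to rate")
    fraction = score / games
    if fraction in (0.0, 1.0):
        # A perfect score has no finite Elo estimate.
        return math.copysign(math.inf, fraction - 0.5)
    return 400.0 * math.log10(fraction / (1.0 - fraction))


records = {
    "99 draws, A wins 1": (1, 99, 0),
    "1 draw, A wins 50, B wins 49": (50, 1, 49),
    "A wins 99, B wins 1": (99, 0, 1),
}
for label, (w, d, l) in records.items():
    print(label, elo_difference(w, d, l), elo_difference(w, d, l, count_draws=False))
```

Counting draws as half a point, the first two records give the same score fraction (50.5 out of 100) and therefore the same small estimated gap. Dropping draws, the first record reduces to a single decisive game and has no finite estimate, while the second barely changes.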

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #70 Posted: Thu Apr 23, 2020 12:51 pm 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
I think what matters is how hard it is to draw with either color, and how much harder it is to win than to draw. Cf this with the earlier problems of the distorted chess example (metrics, distances between results, 0.5+0.5<>1?) and the potentially differing reward for B and W draws below.

One problem with ignoring draws is you cannot measure performance vs perfect play (which may exist in practice). Two almost perfect players (like 1-2 pts away from it) would be seen as performing equally poorly - even though in reality they didn't, you just ignored the evidence.

Button go: Unlike the similar drawless/C example which rounds away from 0 so logically doesn't, button go - depending on its initial error margin balance - can "round" small losses to wins (-0.1 directly to +). So perfect play is beatable even when playing nonperfectly. In this case this rounding size seems to be the limiting factor for Elo (the "smallest unit" - the margin within which you need to be to perfect play for class differences to reduce to/around 1). And consequently, what allows performance to be measured even vs perfect play.

With half point komi / W wins ties, the rounding is done to integer first on the board (so small losses at most rounded to draw), then only in a separate big swing, all draws are (may) treated as W wins. This can matter for metrics/distances reasons like above.

With perfect integer komi, we don't know if the initial error margins favor one side or not. So with W this margin to perfect play (class-1 boundary) may be closer than 0.5 while with B may be farther. But since W draws = B draws, this will in any case average to class-1 = [-0.5]. But if W and B draws are rewarded differently, the smallest unit of distance may become the smaller of the two cases without averaging.

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #71 Posted: Thu Apr 23, 2020 2:20 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
moha wrote:
I think what matters is how hard it is to draw with either color, and how much harder it is to win than to draw. Cf this with the earlier problems of the distorted chess example (metrics, distances between results, 0.5+0.5<>1?) and the potentially differing reward for B and W draws below.

One problem with ignoring draws is you cannot measure performance vs perfect play (which may exist in practice). Two almost perfect players (like 1-2 pts away from it) would be seen as performing equally poorly - even though in reality they didn't, you just ignored the evidence.


Well, you can argue the other way around. Draws hide the difference between players, so it is accounting for them rather than ignoring them that makes you think that they are performing equally poorly.

Quote:
Button go: Unlike the similar drawless/C example which rounds away from 0 so logically doesn't, button go - depending on its initial error margin balance - can "round" small losses to wins (-0.1 directly to +).


How? It depends, I suppose, on what you mean by a fractional error. I can define it for chilled go as the difference between the final score and the komi, whatever that is. (It doesn't actually need to be an integer for chilled go, since the chilled go scores are not necessarily integers. They are rational fractions, so you could avoid draws with an irrational komi. ;)) With no kos it is impossible to "round" the chilled go result more than to the nearest integer. You can't change a small negative fraction to a positive result, only to 0 or -1. Now, if you define a fractional error differently, I don't know what can happen.

There is another kind of rounding between territory scoring and area scoring, where an even territory score is typically "rounded" to the next higher area score for Black, leading to a greater difference between area scores. What the button normally does is to simply add ½ pt. to territory scores, with no rounding at all. Without the button the usual effect of this rounding is to turn a territory score of +6 to +7, which would be a zero score after a 7 pt. komi is subtracted. With the button the score of +6 becomes +6½ which becomes -½ after subtracting the 7 pt. komi. The loss stays a loss. Likewise, a 7 pt. territory score normally becomes +½ after subtracting komi, and stays a win. In effect, the button subtracts ½ pt. from an integer territory komi. (Since the 7½ pt. komi seems to favor White, that might be a good thing, I dunno.)
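
The arithmetic in the paragraph above can be sketched as follows. This is a simplified model that assumes no kos or sekis, an integer komi of 7, and the "usual" rounding described here (an even territory score is bumped to the next higher score under area scoring). The function names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def area_style_result(territory_score: int, komi: int = 7) -> Fraction:
    """Without the button: an even territory score is rounded up by one point."""
    rounded = territory_score + 1 if territory_score % 2 == 0 else territory_score
    return Fraction(rounded - komi)


def button_result(territory_score: int, komi: int = 7) -> Fraction:
    """With the button: add half a point to the territory score, no rounding."""
    return territory_score + HALF - komi


for score in (6, 7):
    print(score, area_style_result(score), button_result(score))
```

In this model a territory score of +6 ends as 0 without the button but as -1/2 with it, and +7 ends as 0 without the button but as +1/2 with it, so the button separates the two cases instead of merging them.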

With all this rounding, what can cause a loss at one level to become a win by yielding a larger difference between scores at the next lower level is ko. The button does not stop that from happening.

Quote:
So perfect play is beatable even when playing nonperfectly.


Not with the definition of fractional errors in terms of chilled go scores. Maybe with a different definition.

Quote:
In this case this rounding size seems to be the limiting factor for Elo (the "smallest unit" - the margin within which you need to be to perfect play for class differences to reduce to/around 1).


So don't round. I.e., use chilled go for potentially a countable infinity of classes of play. If that's what you want, OC.

Quote:
And consequently, what allows performance to be measured even vs perfect play.


No comprende. I thought that not rounding accounted better for small differences in play. :scratch:

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #72 Posted: Thu Apr 23, 2020 3:04 pm 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
Bill Spight wrote:
Draws hide the difference between players, so it is accounting for them rather than ignoring them that makes you think that they are performing equally poorly.
I can probably guess what you mean, but it doesn't seem to apply here: vs perfect play, draws (their frequency) are your ONLY source of information.

Quote:
Quote:
button go - depending on its initial error margin balance - can "round" small losses to wins (-0.1 directly to +).
How? It depends, I suppose, on what you mean by a fractional error.
Quote:
So perfect play is beatable even when playing nonperfectly.
Not with the definition of fractional errors in terms of chilled go scores. Maybe with a different definition.
Yes I didn't mean in strict CGT terms. So far it was assumed that there (may) exist minor mistakes or inaccuracies in various sense. I suppose button go (with integer komi) is a theoretical win for B or W, and I didn't see a reason for the winning player's initial advantage (however measured) to completely disappear on the smallest mistake (instead of some margin). By small minus I meant negative performance (from the optimum) not theoretical score - but OC then the rounding is done to 0 (perfect play) not positive. My bad. Rounding scores from negative to 0 on the board, then to positive/win with half point komi may be possible in non-button go though.

Button go vs perfect play: what if the optimal play and win involves rounding up a certain chilled score, but the player chooses a line where he rounds up a bit smaller chilled score instead? IIRC it was possible that the button is not the last play.

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #73 Posted: Thu Apr 23, 2020 5:20 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
moha wrote:
Bill Spight wrote:
Draws hide the difference between players, so it is accounting for them rather than ignoring them that makes you think that they are performing equally poorly.
I probably guess what you mean but it doesn't seem to apply here: vs perfect play draws (their frequency) are your ONLY source of information.


Do you mean that perfect play never loses? That may be so in chess, but I don't think that it is a general rule.

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #74 Posted: Thu Apr 23, 2020 5:22 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
moha wrote:
Button go vs perfect play: what if the optimal play and win involves rounding up a certain chilled score, but the player chooses a line where he rounds up a bit smaller chilled score instead? IIRC it was possible that the button is not the last play.


Yes, with certain kos taking the button will not be the last play. Kos can invalidate rounding, button or no.

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #75 Posted: Sat Apr 25, 2020 1:23 pm 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
This came back to me half-asleep this morning, and some of my remaining doubts were resolved.

It seems best to consider games with W and B as two different games played alternately, and always use the weaker player's view. If performances were on a continuous scale, normal as usual, their difference is another normal, now with a negative mean. The sd is usually relatively small (reason), except when BOTH players are weak. Here are some typical cases (performance differences, uniform y scale for simplicity).

Now either we use integer komi, or W wins draws (0.5 komi). And either the game is roughly balanced, same difficulty for both players (0.5-0.5 pts initial error margin balance) or is significantly harder to play / draw for a side (like 0.1-0.9 pts balance, ie. one of W or B can only afford 0.1 pts worth of subpoint mistakes before letting his draw slip, the other has 0.9). The raw theoretical result is always draw (komi).

Without discrete units these performance differences could be getting arbitrarily narrow. But the easiest point earning result puts a marker (the minimum needed for not losing) on the X-axis on the above plot. Anything to its right is treated as equally good (rounded up). So (loosely speaking) once the distribution has its peak near the marker (and we are close enough to perfect play, the ultimate opponent), we are good enough to score enough in half of the cases with this color and not to be outclassed by anybody anymore.

The marker is at the relevant half of the initial error margin balance (which in turn sums up to the smallest scoring unit). It is normally negative (tiny errors are allowed and still draw) with the two colors averaging to -0.5 (the 2 classes per point). But if draws are treated as losses (thus even perfect play "loses" some to nonperfect play) the marker is positive for a color, which case can practically be missing (unreachable vs significantly better opponent). So we can end up with the remaining case, either the smaller or larger half point of the draw balance (more or less than 2 classes per point) depending on which side draws are awarded to.
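
A small sketch of the marker idea above, under the stated assumption that the performance difference (weaker player's view, in points) is normally distributed. The function name and the example numbers are illustrative only; they are not taken from any measurement.

```python
from statistics import NormalDist


def prob_not_losing(mean_diff: float, sd: float, marker: float) -> float:
    """P(performance difference >= marker) for a normal performance difference."""
    return 1.0 - NormalDist(mu=mean_diff, sigma=sd).cdf(marker)


# Peak of the distribution exactly at the marker: the result is reached half the time.
print(prob_not_losing(mean_diff=-0.5, sd=0.3, marker=-0.5))

# A 0.1-0.9 error margin balance: the same player faces a marker at -0.1 with
# one color and at -0.9 with the other.
print(prob_not_losing(mean_diff=-0.5, sd=0.3, marker=-0.1))
print(prob_not_losing(mean_diff=-0.5, sd=0.3, marker=-0.9))

# A positive marker (draws counted as losses for this color) is rarely reached.
print(prob_not_losing(mean_diff=-0.5, sd=0.3, marker=0.1))
```

Everything to the right of the marker counts the same, which is why only the tail probability matters here and not the shape of the distribution beyond it.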

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #76 Posted: Tue Apr 28, 2020 10:56 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
Let me clear up some confusion I may have caused about chilled go and rounding. I was a bit confused, myself. :oops:

[go]$$ Chilled go -1¼ pts.
$$ ------------------
$$ . . X . . . O . .
$$ . . X O O O O . .
$$ . . X X X X O . .[/go]


Suppose that the rest of the board is settled and all of these stones are unconditionally alive.

-1¼ is not only the local chilled go score, it is the estimate of both the territory and area local scores.

OC, play stops in chilled go, but may continue under territory or area scoring. Black to play can round the local score up to -1, White to play can round the local score to -2 by territory scoring, -3 by area scoring.

[go]$$ Black rounds up
$$ ------------------
$$ . . X 1 2 . O . .
$$ . . X O O O O . .
$$ . . X X X X O . .[/go]


[go]$$W White rounds down
$$ ------------------
$$ . . X 1 . . O . .
$$ . . X O O O O . .
$$ . . X X X X O . .[/go]


Now let's look at a fractionally worse position.

[go]$$ Chilled go -1½ pts.
$$ ------------------
$$ . . X . . O . O .
$$ . . X O O O O O .
$$ . X X X X X X O .[/go]


Again, -1½ is the local chilled go score and the estimate for both the territory and area local scores.

[go]$$B Black rounds up
$$ ------------------
$$ . . X B . O . O .
$$ . . X O O O O O .
$$ . X X X X X X O .[/go]


:b1: rounds up to a local territory score of -1, which is also the estimate of the local area score, which depends on who gets the dame.

[go]$$W White rounds down
$$ ------------------
$$ . . X 1 . O . O .
$$ . . X O O O O O .
$$ . X X X X X X O .[/go]


:w1: rounds down to a local territory score of -2, and local area score of -3.

It is asserted that, without ko complications, correct play under chilled go is also correct under territory scoring (except with special rules, such as not counting territory in seki) and under area scoring.

But let's suppose that Black at some point, assuming correct play thereafter, has the option of playing to the first diagram or to the second diagram, with the territory score being the same elsewhere on the board. However, the first option will round down to -2, while the second option will round up to -1, which means that the second option is better by territory and area scoring, despite having a lower score by chilled go. If play by chilled go is correct by territory and area scoring, how can that be?

What I overlooked in the previous discussion was the difference between the local chilled go score and the global score. If the -1¼ pt. position rounds down, that means that it is White's turn to play, and, since Black plays first in go, Black has made an extra play. Under chilled go each play costs 1 point and so globally we subtract 1 pt. from the territory estimate to get the chilled go score. So, instead of being ¼ pt. better for Black than Diagram 2, Diagram 1 is ¾ pt. worse, given the penalty for the extra Black play elsewhere on the board.
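
The bookkeeping in that last paragraph, written out with exact fractions. The variable names are illustrative; the numbers are the ones from the two diagrams and the 1-point cost per play in chilled go.

```python
from fractions import Fraction

diagram_1_local = Fraction(-5, 4)  # local chilled score of -1 1/4
diagram_2_local = Fraction(-3, 2)  # local chilled score of -1 1/2
extra_black_play_cost = 1          # each play costs 1 point in chilled go

# Locally, Diagram 1 looks 1/4 point better for Black.
local_difference = diagram_1_local - diagram_2_local

# Globally, Diagram 1 carries the penalty for Black's extra play elsewhere.
diagram_1_global = diagram_1_local - extra_black_play_cost
global_difference = diagram_1_global - diagram_2_local

print(local_difference, global_difference)
```

The sign flips from +1/4 locally to -3/4 globally, which matches the territory and area outcome where the second option is the better one.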

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #77 Posted: Fri May 08, 2020 12:52 am 
Dies with sente

Posts: 113
Liked others: 11
Was liked: 27
Rank: 1d
Universal go server handle: iopq
mhlepore wrote:
To me, Silver's claim seems reasonable (at least to the point of the game being solved).

To use a sports analogy, occasionally weaker players can beat stronger players. Why?
--Human athletes make blunders.
--Balls rattle off the rim in unlucky ways.
--There is a lot of natural noise in certain games (e.g., baseball), such that the best teams lose 1/3 of the time.

The strong Go programs, to me anyway, are not likely to be impacted by this variability. They are not likely to misread a life/death problem, or a ladder, or miscount. And once you knock out a lot of the mistakes, it becomes in essence about backward induction, and the more powerful system should pretty much always win.

That said, I wonder if Silver would still back the statement, however replacing 100-0 with 1,000,000-0.


First of all, they are really bad at large semeais. This is because there's no code that simplifies the search in the case of different orders, I think they don't even have transpositions. So if there are 20 liberties for each group, but no escape or life, there's a chance a computer tenukis and turns a seki into certain death.

In the case of ladders, KataGo has code for them, but it may not play out a ladder that doesn't work - even if this wins the game, like in the ladder game. This is because the code only says "this ladder fails" which may bias against searching it.

In terms of endgame, it is helpless in the examples that can be solved by endgame theory. There are 20 different moves, all seem to be around 1 point, but only a few correct orderings exist.

But actually, the reason why the next bot beats the previous bot is actually very different. It will play Mi Yuting's flying dagger slightly better, getting a small advantage that it turns into a win. The next bot will find another variation that is even better, etc.

I've already had this happen on 9x9, each generation beats the previous one by 65%, two generations ago by 75%, etc.

Yet against KataGo all of them are around 40% - my own bot can beat the previous generation more convincingly than KG, but it can't beat KG.

So while 100-0 multiple times will happen, it will be against itself. Against an opponent that doesn't play the same joseki the improvement will be slow. This is why self-play data for Elo is extremely inflated.
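
As a rough consistency check on the 9x9 figures quoted above, here is what a purely additive Elo model would predict. It assumes ratings add across generations, which means win odds multiply; the function name is illustrative.

```python
def chained_win_rate(per_generation_rate: float, generations: int) -> float:
    """Win rate across several generations if Elo gaps simply add up."""
    odds = (per_generation_rate / (1.0 - per_generation_rate)) ** generations
    return odds / (1.0 + odds)


print(chained_win_rate(0.65, 1))  # one generation back
print(chained_win_rate(0.65, 2))  # two generations back, under the additive model
```

Under this model 65% per generation predicts roughly 77.5% against the bot from two generations ago, a little above the reported 75%. The model says nothing about results against an unrelated opponent such as KataGo, which is the comparison the post is really about.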


This post by iopq was liked by 3 people: Bill Spight, Harleqin, sorin
 
 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #78 Posted: Fri Nov 26, 2021 3:20 pm 
Beginner

Posts: 5
Liked others: 0
Was liked: 0
How about this perspective: the problem will remain computational until the game has been mapped.

1. Think of a 4x4 board, or some other reasonable minimum (keeping this vague on purpose): is there a complete tree? It might be more complex than tic-tac-toe (which I have read has been mapped) because stones may be removed, but this just needs more computational power.
2. If such a minimum reasonable board game is mappable, is there any board size threshold beyond which this is no longer so? If not, the game is finite.
3. Before mapping is done, refining algorithms makes perfect sense, and I agree that handicap is probably not what David Silver referred to; he spoke of percentage of wins instead. I can't see why a sudden inflation in the depth of predictions, probably enabled mainly by computing power, might not, theoretically, enable 100-0 wins for each generation. Engaging algorithms in developing new blueprints for 'cracking' go seems science fiction, but if a blueprint for THIS should be 'cracked' at some point, sky is the limit - and that's the last thing we need to do.
4. As and when mapping is completed, the rest would be 'sampling', not 'synthesis', but that gets off-topic.

I think DS may have meant that progress will be immense and there is plenty of depth left to explore. But for such a spectacular explosion of progress (100-0's), I think humans need to be out of the design table. Just sit back and try to make sense :)
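
For a sense of scale on the mapping question, a crude upper bound: every point is empty, black, or white, so an n x n board has at most 3^(n*n) configurations. This counts board configurations (many of them illegal), not the game tree, which is larger still; the function name is illustrative.

```python
def position_upper_bound(board_size: int) -> int:
    """Upper bound on board configurations: each point is empty, black, or white."""
    return 3 ** (board_size * board_size)


for n in (2, 3, 4, 9, 19):
    bound = position_upper_bound(n)
    print(n, "x", n, "has at most a", len(str(bound)), "digit number of configurations")
```

The 4x4 bound is about 43 million, which is small by computing standards, while the 19x19 bound has 173 digits.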

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #79 Posted: Sat Dec 18, 2021 2:43 am 
Dies with sente

Posts: 113
Liked others: 11
Was liked: 27
Rank: 1d
Universal go server handle: iopq
TOTAL wrote:
How about this perspective: the problem will remain computational until the game has been mapped.

1. Think of a 4x4 board, or some other reasonable minimum (keeping this vague on purpose): is there a complete tree? It might be more complex than tic-tac-toe (which I have read has been mapped) because stones may be removed, but this just needs more computational power.
2. If such a minimum reasonable board game is mappable, is there any board size threshold beyond which this is no longer so? If not, the game is finite.
3. Before mapping is done, refining algorithms makes perfect sense, and I agree that handicap is probably not what David Silver referred to; he spoke of percentage of wins instead. I can't see why a sudden inflation in the depth of predictions, probably enabled mainly by computing power, might not, theoretically, enable 100-0 wins for each generation. Engaging algorithms in developing new blueprints for 'cracking' go seems science fiction, but if a blueprint for THIS should be 'cracked' at some point, sky is the limit - and that's the last thing we need to do.
4. As and when mapping is completed, the rest would be 'sampling', not 'synthesis', but that gets off-topic.

I think DS may have meant that progress will be immense and there is plenty of depth left to explore. But for such a spectacular explosion of progress (100-0's), I think humans need to be out of the design table. Just sit back and try to make sense :)



There might be an opening that forces the whole game to go a certain way, so an engine might know how to play perfectly from an empty board despite not knowing how to solve difficult situations that arise on arbitrary boards.

So after mapping out the first X moves, the engine might be strong enough not to lose to perfect play given Y visits per move.

 Post subject: Re: "Indefinite improvement" for AlphaZero-like engines
Post #80 Posted: Sat Dec 18, 2021 7:14 am 
Lives in sente

Posts: 1037
Liked others: 0
Was liked: 180
TOTAL wrote:
How about this perspective: the problem will remain computational until the game has been mapped.

2. If such a minimum reasonable board game is mappable, is there any board size threshold beyond which this is no longer so? If not, the game is finite.



Except finite vs infinite is not really what is at stake. The question is how the task of mapping grows with the increase in board size. If "m" is the amount of computation required for a mapping and "b" the board size, we have:

m = f(b), where "f" is some function, and the question is what sort of function. In other words, how fast does "m" grow as "b" grows?

Now suppose "f" is of the sort b^b. In other words, m = a*b^b. Well, "m" is still finite for any finite values of "a" and "b", but it is increasing VERY fast as "b" increases.
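
A tiny sketch of that growth, taking the supposed form to be m = a * b**b (b raised to the power b) with a constant factor a. Both the form and the function name are assumptions for illustration; nothing here claims this is the real cost of mapping go.

```python
def mapping_cost(board_size: int, a: int = 1) -> int:
    """Hypothetical mapping cost m = a * b**b for board size b."""
    return a * board_size ** board_size


for b in (2, 4, 9, 13, 19):
    # Compare with a merely polynomial cost such as b**2 for contrast.
    print(b, b ** 2, mapping_cost(b))
```

Every value stays finite, but going from b = 9 to b = 19 already takes the cost from a 9-digit number to a 25-digit one, while b**2 only goes from 81 to 361.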
