My concerns about Leela Zero current status and the way its

hydrogenpi7 · #1

TL;DR: Leela Zero has stalled. The dev (GCP) is stubborn, won't lower the gating threshold, and threatens to do things that will actually hurt the progress of the project. One week it's "we will increase the number of visits per move" and the next week it's "we should lower the visits". Last month he himself proposed lowering the gating threshold from 55% to something lower; now he seems adamantly against it. He purports to be running a "scientific experiment", but then is willing to almost accept a random network provided by some stranger on the Internet who had his friend train it, and then censors all talk about others trying to replicate the work so that it can be proven legit. He said weeks and weeks ago that the 10 block days are done and we need to move to 20 block or something else larger, but no action has been taken. Meanwhile we have stalled and gone almost weeks at a time without a new promotion, yet he is unwilling to promote networks that score 54.94% (almost 55%) and is basically happy to waste community computing time, all the while the number of clients is slowly but steadily dropping. At this rate the project will be dead in the water and unrecoverable by this time next month, imho.

///

Long version:

https://github.com/gcp/leela-zero/issues/1113

https://user-images.githubusercontent.c ... aa9f63.png

I propose, as a parallel effort and side test, that someone start a 40 block 256 filter net2net (or another 20 block 256 filter net2net) from the current 10 block net #117 (b9639fe4), backtrain it on the latest 1.5 million games (games 4935001 to 6435001) with many steps, and then, as new official LZ games come out in the future (regardless of whatever network size they come from by then), continue training forwards with this newer/stronger net2net'd network, as opposed to the very old 6 block #78 that the OP originally trained from.

This will be tons stronger than any current 256/20 nets being tested, and when the official LZ channel stalls, we can at that point seamlessly switch over to this much stronger and more robust network. And with the advent of multi-GPU training support, we won't have to worry about the training not catching up with the games, etc.

edit update:

The issue at hand is very soon now the dev of LZ gcp will have to decide which network to up size to, be it a 15 block 192 filter, 20 block 128 filter or 20 block 256 filter. The days of staying on 10 block 128 filters are rapidly coming to a close.

Suffice it to say, the current 256f 20b contenders being tested stem from a very old 6 block #78 that was net2net'd, and even so, they have not caught up with the current games, perpetually lagging at least 1 million games behind.

As there is no multigpu support for the training code as of yet, and as the 3rd party guy who did the 256f 20b makeshift test net attested to having only 1x gtx 1080ti, he rightly stated that he is having trouble catching up with the new games and stated on github previously that he doesn't think he'll be able to catch up.

Once we switch over to the larger networks, assuming gcp decides to use this guy's 256f 20b as the official net going forwards, it marks a splitting point. It would be good to know whether a brand new 20 block 256 filter net could be bootstrapped via n2n from the last/final 10 block (swa), backtrained on the last 1.5 to 3 million games, and then forward trained on the new games that will be coming out from here on out.

Since this process is time consuming, and you yourself have stated that it will take several weeks, I'm not sure why you deem it imprudent for there to be a public announcement (whether by me or anyone else) encouraging others who may have the inclination to undertake such an initiative to take a head start and jump on the parallel track as soon as possible. That would increase the odds of success, so that if and when the main channel stalls, we have a side channel bootstrapped from a much newer arch. It could potentially have a higher ceiling and would thus offer the LZ community a means to switch to something stronger in order to protect and maintain the continuity of progress.

It is important to think and plan strategically ahead, and ironically, had this advice been taken, we wouldn't be in the position we are in right now, wondering how much stronger the 256/20 net could be if the training had been able to catch up with the latest 1 million or so games.

Update 3:

In a consolidated response to the comments below: it is important to recall the proper context of how it all came about. About 11 days ago, a 3rd party dude randomly posted on github a 256f.20b network that he had trained. He alleges that it took him about 14 days to train the network, but that he started, for whatever reason, from a very old network (network #78), a 6 block that still had significant issues with ladders, large groups, etc. When he posted his results, everyone on github, including the dev GCP himself, was "very surprised" that it was "so strong". This is especially so because GCP noted that he himself had tried multiple times in the past to produce a larger net (of various sizes and configurations) but was unable to produce any that was stronger, and never got to the strength of the net that the 3rd party dude shared.

I think it is fair to say that if even the head honcho of Leela Zero, none other than the great GCP himself, couldn't produce such a strong large net, then whatever the third party guy produced was special, unusual, or uncommon. In addition to putting it in the queue for official matching, GCP also entertained the idea, brought up by the dude, of using his 256f.20b network as the OFFICIAL network for Leela Zero going forward! This was concerning and unusual for several reasons. First of all, up to that point in time GCP alone trained all the networks. Sure, the community crowd-computed the self-play games, but the actual networks were always trained on GCP's own computers and by his own hand. So it was certainly unprecedented for there to be a shift in policy to allow a random third party dude on the Internet to produce and share a network and then all of a sudden, just because it is stronger, have it adopted as the Leela Zero main line without question or pause. The irony in all of this is that Leela Zero is in essence and spirit a scientific initiative to reproduce the AlphaZero papers. Yet here was GCP ready to accept, or at least appearing to seriously entertain, a random internet guy's network (self-purported to be trained on very old LZ games from a net2net of a very old LZ net) as the mainline, and to adopt it as the official LZ network going forward without question. When I brought up this reasonable concern, GCP merely asked facetiously whether I believed that the dude had mixed pro games with LZ games and come up with something stronger, in essence ridiculing my prudent caution and silencing a critical line of thought. In reality, we do have hybrid mixed pro + LZ games networks that are far stronger than any current official LZ network, including for example the "Leela Master" network weights. So my point is that not only is something like this possible and plausible, in fact it has already happened!

Suffice it to say, after adjusting for time (on time parity), it was apparent that the 256f.20b network was actually slightly weaker than the then-current official LZ net, due to it being about five times slower. The only reason GCP was hesitant to switch to the unofficial 256f.20b network is that after the time adjustment it was actually not stronger than the current LZ net. Part of the issue was that the LZ training code is NOT multi-GPU, and the other part was that the guy who did the 256f.20b network only had a single GTX 1080 Ti, so even if the code were multi-GPU it wouldn't really help in this instance. So he decided to keep training with more and more recent networks, but the other problem was that he was running a "Red Queen's race": new games were being produced faster than he could keep up with! Every time he compared a new 256f.20b network against the then "new" LZ network, it was behind, and it would stay perpetually behind simply because the lag of a million games or so could not be closed. So even though the new versions of the 256f.20b network do get stronger after he trains on newer games, the games he uses for training come from the LZ mainline self-play games, which means it is always going to be behind and will never catch up! Essentially the 256 network "tracks" the progress of the current LZ network while lagging a million games behind, and the lag in strength remains unchanged over time! This is the catch-22 and vicious cycle we are stuck in, because GCP is hesitant to switch until the 256f.20b network becomes stronger than the current net on time parity!

This is precisely why I brought this to the attention of the community: to jump ahead in the timeline, to account for the long lead time, and to mitigate the possibility of something like this ever occurring again in the future.

To all those that say, "do it yourself because it's open source", I would say that I was very open and transparent about never having had any previous experience with training networks. If even GCP could not make it work and had to rely on the help of this mysterious 3rd party guy, it raises the question of why the community chastised me for doubting that I would be able to produce the same level of results myself. It's called being honest and realistic. But just because I do not have the technical experience does not mean I can't contribute my opinion on the higher vision and goals, and on better strategies for logistics, efficient use of the pipeline, etc. Imagine if I had "done it myself". Very likely, as a first-timer, I would make several mistakes and end up two weeks later with a much weaker or altogether broken network. How would this help anyone, myself or the project? Yes, it would be a good learning experience for me personally, but how would it actually help the project at this time-critical juncture? This is neither the time nor the place for high-handed "it's open source, do it yourself or go home" rhetoric. Even if I did everything 100% correctly and wasn't able to duplicate the results, instead of questioning the 3rd party dude's network and how he obtained it (and whether or not it was reproducible), the community would simply turn against me and point fingers: "you are a novice, this is your first trained network, surely you must have done many things wrong". So knowing and admitting that I lacked credibility in terms of experience in this particular regard is one more reason that contributed to my thinking that asking others more knowledgeable for help might be the better overall course of action.

So in light of the original 3rd party dude's open hesitation to redo his work on a newer network with newer games (he admitted he has no idea whether a redo would be much stronger or not, but his very ardent hesitation to start over on a newer arch seems to stem from ego, or from emotional attachment to his preexisting work and an unwillingness to let go and accept change, rather than from technical reasons), I wanted to bring this to the attention of other members of the LZ community so they could try where the OP failed. In essence I was sending an open message calling upon all those who are much more technical and skilled at network training than me to give this another go.

In projects of any size, sometimes it is the people and personalities holding things back, not just the technical aspects. Even very technical people can be biased and stubborn. Sometimes the more one has contributed and the higher one's status in the community, the harder it is to let go and do what is right as opposed to following the inertia of things. This is simply part of the human condition. And what is true, what is right, and what is best isn't determined by community votes or by the louder groupthink voice. A discussion should be based on the facts and the merits of the points themselves, not on personal attacks or popular opinion. This is how we get stuck in local optima.......

///////////

I don't believe gcp's all or nothing approach is correct. He insists on either sticking hard and fast with the 55% gating or getting rid of gating altogether, which would potentially allow LZ to regress significantly in winrate under a promote-all scheme that would mandate the promotion of terrible networks with, for example, a 15% winrate! Many have proposed a lower bound of 50.0% on gating to ensure the network doesn't regress, while lowering the threshold from 55% to 53% or even 52% as an adaptive gating, for example if a certain number of days have passed since the last network promotion and no new network has reached 55%.

Case in point: often we go almost a week without a new net, when in those instances we almost got a promotion right off the bat. Had we taken the 53% or 54% and gone with it, odds are that within a day or so we would have gotten another 53% or 54%, and so on and so forth!! So within that one week's worth of time we would have progressed far more than by sticking to a hard and fast bright-line rule of strict 55% gating while not making any progress for an entire week! Put another way, imagine a scenario in which the gating was set to something like 70% (a winrate nearly impossible to ever achieve at this level!), so we'd be stuck for months and months without a new net. We are at the level of progress where we must take a more granular approach and allow gating thresholds of less than 55%. But this DOES NOT MEAN we should do a "promote all" and allow the network to go backwards!
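To make the adaptive gating idea above concrete, here is a minimal sketch. Only the 55% default and the 50.0% floor come from the proposal itself; the grace period, the one-point-per-day relaxation, and the function names are my own illustrative assumptions, not anything from the LZ server code.

```python
def gating_threshold(days_since_promotion: float,
                     base: float = 0.55,        # current gating threshold
                     floor: float = 0.50,       # never promote a net that loses
                     step: float = 0.01,        # assumed relaxation per step
                     grace_days: float = 2.0,   # assumed wait before relaxing
                     days_per_step: float = 1.0) -> float:
    """Win rate a candidate must reach to be promoted (hypothetical rule)."""
    if days_since_promotion <= grace_days:
        return base
    # One step down for every full `days_per_step` past the grace period.
    steps = int((days_since_promotion - grace_days) // days_per_step)
    return max(floor, round(base - steps * step, 4))


def should_promote(wins: int, games: int, days_since_promotion: float) -> bool:
    """Promote only if the match win rate clears the current threshold."""
    if games <= 0:
        return False
    return wins / games >= gating_threshold(days_since_promotion)
```

With these assumed numbers, a 54% net is rejected right after a promotion but accepted once four days have passed without one, while a 15% net is never promoted no matter how long the drought lasts.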

/////////////////////

why would @Ttl's method not work unless weaker networks were promoted? if only equal or stronger networks are promoted at 50.0, how would it be any worse than allowing it to get weaker.... it doesn't have to be exact, since over time the false positives and false negatives and under and over estimates average each other out anyhow (thus no need to run more than 400 games per test match), but a straight promote of anything 50.0 or higher seems logically and intuitively better than promoting everything, unless there is data and literature that suggests otherwise. we don't have the data from deepmind to show that what worked for them would work best in every circumstance.
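For a rough sense of how noisy a 400-game match is, here is a small sketch. It assumes a fixed 400 games per match with independent results and no early stopping, which is a simplification of mine; the function names are illustrative only.

```python
from math import comb, sqrt


def prob_at_least(wins_needed: int, games: int, p: float) -> float:
    """Exact binomial probability of scoring at least `wins_needed` wins."""
    return sum(comb(games, k) * p**k * (1 - p)**(games - k)
               for k in range(wins_needed, games + 1))


def standard_error(p: float, games: int) -> float:
    """Standard error of the measured win rate for a true win rate `p`."""
    return sqrt(p * (1 - p) / games)


GAMES = 400
# Noise on the measured win rate of an equal-strength candidate (about 0.025).
se_equal = standard_error(0.5, GAMES)
# Chance that an equal-strength net passes a 55% gate (needs 220 of 400 wins).
false_pass_55 = prob_at_least(220, GAMES, 0.5)
# Chance that an equal-strength net passes a 50% gate (needs 200 of 400 wins).
false_pass_50 = prob_at_least(200, GAMES, 0.5)
```

Under these assumptions the measured win rate carries roughly 2.5 percentage points of noise, so a single match cannot tell 54.94% from 55%. It also shows the trade-off: an equal-strength net passes a 50% gate about half the time, but only rarely passes a 55% gate.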

afaik AGZ didn't have such a small training window either, so in order to duplicate it precisely everything would have to match on the way up.

///////////////////////

http://archive.is/QgvpB

apparently the dude didn't even train it himself, he had his "friend" help him with training. who knows what the "friend" put into the network to make it strong when even gcp could not do so... (and to think this third hand account of a probably hybrid network almost made it into the official network without the community's consent)

https://github.com/gcp/leela-zero/issues/167

and now the training data set has disappeared and can't be downloaded anymore...

HMMMMMMMM hmmmm......

I had to state this but things are getting strange... now GCP is adamant the best approach is to LOWER the visits, when just a week or two ago he wrote a huge write-up on why visits should be INCREASED...

I'm not sure what is going on here really

https://www.reddit.com/r/cbaduk/comment ... o/dww17sq/

Uberdude · #2

Haven't you been complaining that Leela Zero has stalled for the past few months, during which time it has kept improving?

Javaness2 · #3

Given the success of Gian-Carlo's software and the niceness of it being free, I really have a hard time imagining that LeelaZero has a dark future. It is still clearly improving. Could it go faster? Maybe.

Gomoto · #4

Is there a program available to the public that is stronger than the current Leela Zero? ;-)

zermelo · #5

Dear hydrogenpi,

you have been ranting all over the internet about LeelaZero, and your rants do not make very good sense. If you have any ability to look at the situation objectively, you see that practically no one believes you or cares about what you are complaining about. Even if your points made sense, at this point the only thing you could do is to start your own project and forget about LeelaZero.

In any case, your obsession is seriously unhealthy. It would be best for you to forget computer go for a month and find something else to think about. Consider some professional help too.

tchan001 · #6

It's free and it is steadily improving. It was started by a single guy, so how much time do you expect him to devote to a labor of love? I think we should be thankful and encouraging for his great efforts.

Thank you, Gian-Carlo Pascutto, for sharing and for all you have done for the go community.

sorin · #7

hydrogenpi7 wrote:

and now the training data set has disappeared and can't be downloaded anymore...

The all.sgf.xz file at this link https://sjeng.org/zero/ has all the training games.
