CPU vs GPU

Kirby · #1

How important is GPU to use Elf for analysis?

Presumably, you can get more playouts faster, but if Lizzie shows, say, 50k playouts on a CPU vs. 50k playouts on a GPU, using Elf's network, is the evaluation the same?

I.e. if I let Elf think for long enough on a CPU, do I get the same quality analysis as GPU?

Tryss · #2

It doesn't matter if you use a GPU or a CPU. The result is the same (or at least, should be)

Obviously, a CPU will be much slower.

Kirby · #3

Thanks, that's what I thought. I haven't made the time to learn how elf works in detail, so I wanted to double check.

Has there been any study of move variation across playouts? At what number of playouts does elf typically converge to what it believes to be the best move? Does this number change as the game progresses?

jlt · #4

Not sure that with an ordinary computer we are able to see convergence.

viewtopic.php?p=234917#p234917

Kirby · #5

Hm, if there is no convergence maybe it's worthwhile to get a better machine with a GPU.

Gomoto · #6

I use my desktop at home (10k-100k if I am interested in a move.)

But even with low visit count on my laptop (<100) you are able to spot mistakes in your game. It is really a great tool already with low visits.

desktop: shows you strong moves that are difficult to refute

laptop: shows you good candidate moves and most (>95%) of your own mistakes

But if you have the spare money and you are very interested in go, good hardware is a reasonable choice.

dfan · #7

Kirby wrote:

Thanks, that's what I thought. I haven't made the time to learn how elf works in detail, so I wanted to double check.

Has there been any study of move variation across playouts? At what number of playouts does elf typically converge to what it believes to be the best move? Does this number change as the game progresses?

My recollection is that doubling the number of visits generally results in approximately the same increase in strength, whether that is going from 100 visits to 200 or 100,000 visits to 200,000. There is no reliable point at which the networks will always converge to an answer that will not change with further visits.
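As a rough illustration of that recollection (an assumption, not a measured result): if every doubling is worth about the same amount of strength, then what matters is the number of doublings, i.e. the base-2 logarithm of the visit ratio. The helper name `doublings` is hypothetical.

```python
import math

def doublings(visits_from: int, visits_to: int) -> float:
    """Number of times visits_from must be doubled to reach visits_to."""
    return math.log2(visits_to / visits_from)

# Under the "equal gain per doubling" assumption, these two steps are worth the same:
step_small = doublings(100, 200)          # one doubling
step_large = doublings(100_000, 200_000)  # also one doubling

# Going from 100 visits to 100k visits is roughly ten such steps.
laptop_to_desktop = doublings(100, 100_000)
```

On this view, an extra 100,000 visits on top of 100,000 buys about as much as an extra 100 visits on top of 100.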

Kirby · #8

Gomoto wrote:

laptop: shows you good candidate moves and most (>95%) of your own mistakes

I get the general sentiment that you are conveying, but I don't understand how this assertion can be made with high confidence.

-How do you know there aren't mistakes it didn't catch?
-How do you know they are real mistakes if there is no convergence? Maybe another 10k playouts will give a different answer.
-Where does 95% come from?

I'm becoming warmer to the idea of using Elf, LZ, and crew for new ideas. But to give confident bounds on optimality is still a leap for me.

Tryss · #9

Kirby wrote:

-How do you know there aren't mistakes it didn't catch?

It is certain that there are mistakes it didn't catch. If it didn't miss any mistakes, it would have solved go.

Look at it this way: if you eliminate all the mistakes pointed out by LZ (with high enough playouts), you'll be the strongest go player on earth. That seems good enough to me.

Bill Spight · #10

Kirby wrote:

Gomoto wrote:

laptop: shows you good candidate moves and most (>95%) of your own mistakes

I get the general sentiment that you are conveying, but I don't understand how this assertion can be made with high confidence.

-How do you know there aren't mistakes it didn't catch?

The concept of a winrate (different from 1 or 0) is based upon making mistakes. Obviously, if both you and the program make the same mistake, it won't catch it.

Quote:

-How do you know they are real mistakes if there is no convergence?

In theory there is convergence in the limit, as the estimated winrates approach 1 or 0.

Quote:

-Where does 95% come from?

I'm becoming warmer to the idea of using Elf, LZ, and crew for new ideas. But to give confident bounds on optimality is still a leap for me.

The bots do not produce any error estimates or confidence bounds. I have met resistance to the idea of generating any.

Bill Spight · #11

Kirby wrote:

Has there been any study of move variation across playouts? At what number of playouts does elf typically converge to what it believes to be the best move? Does this number change as the game progresses?

dfan wrote:

My recollection is that doubling the number of visits generally results in approximately the same increase in strength, whether that is going from 100 visits to 200 or 100,000 visits to 200,000. There is no reliable point at which the networks will always converge to an answer that will not change with further visits.

For AlphaGo Teach, its team settled on a figure of 10,000,000 simulations as good enough. Not that they published any error rates or confidence intervals.

This is a general area I have been interested in for a long time. In 1968, as a student, I attended a conference of the New England Psychological Association. One speaker talked about his research in training physical performance of various complex tasks. For each task you might have different ways, different devices and algorithms to measure the person's performance. You could make your measurements more sensitive by altering the algorithms. He addressed the question of when to stop increasing the sensitivity of the algorithm.

In the case of the winrate estimates of Leela Zero, Elf, et al., we may consider them with different settings as different algorithms with different levels of sensitivity. One stopping criterion is to stop when the differences in winrates between successive settings become random. I have only compared Leela Zero at settings of 100k visits and 200k visits on a single game review. At those settings, for that game, it's not even close to meeting that criterion.
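One possible way to make "the differences become random" concrete is sketched below. It assumes that "random" means the per-move differences between two settings scatter around zero with no systematic shift, and it checks this with a crude mean-versus-standard-error rule. The function name, the threshold `z`, and the sample numbers are all hypothetical; this is only one reading of the criterion, not a method used by any of the bots.

```python
import math
import statistics

def differences_look_random(winrates_a, winrates_b, z=2.0):
    """Return True if the per-move differences show no systematic shift.

    winrates_a, winrates_b: winrates (as fractions) for the same moves,
    estimated at two successive settings, e.g. 100k and 200k visits.
    Assumption: the differences count as "random" when their mean lies
    within z standard errors of zero. Needs at least two moves.
    """
    diffs = [b - a for a, b in zip(winrates_a, winrates_b)]
    mean = statistics.fmean(diffs)
    stderr = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return abs(mean) <= z * stderr

# Hypothetical numbers: small differences that wobble around zero ...
noisy = differences_look_random(
    [0.50, 0.55, 0.48, 0.60, 0.52],
    [0.51, 0.54, 0.49, 0.59, 0.52],
)
# ... versus a consistent upward shift at the higher setting.
shifted = differences_look_random(
    [0.50, 0.55, 0.48, 0.60, 0.52],
    [0.55, 0.61, 0.52, 0.66, 0.57],
)
```

With these made-up inputs, `noisy` is `True` and `shifted` is `False`: only the second pair suggests that the higher setting is still changing the picture systematically.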

With neural networks my guess is that the main factor is the time taken to run the computer. Choose a setting that takes a reasonable amount of time and go with that.

Here is another approach. Suppose that with Elf we think that a delta of 10% or more is a likely mistake. Having run Elf with a certain number of visits, note the probable mistakes and run Elf with double that number of visits. If the delta for a certain move identified as a probable mistake is greater than its delta at the lower setting, take that as confirmation that it is actually a mistake.
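The procedure above can be sketched as follows. The data layout (a dict from move number to winrate drop, as a fraction), the function name, and the sample values are assumptions for illustration; neither Elf nor Lizzie is claimed to export data in this form.

```python
def confirm_mistakes(deltas_low, deltas_high, threshold=0.10):
    """Flag probable mistakes at the lower setting, then confirm them.

    deltas_low:  {move_number: winrate drop} at N visits
    deltas_high: {move_number: winrate drop} at 2N visits
    A move is a probable mistake if its drop at N visits is at least
    `threshold`; it is confirmed if its drop at 2N visits is larger still.
    """
    probable = sorted(m for m, d in deltas_low.items() if d >= threshold)
    confirmed = [m for m in probable if deltas_high.get(m, 0.0) > deltas_low[m]]
    return probable, confirmed

# Hypothetical deltas for three moves of one game.
low = {12: 0.12, 30: 0.04, 47: 0.15}
high = {12: 0.18, 30: 0.11, 47: 0.09}
probable, confirmed = confirm_mistakes(low, high)
```

Here moves 12 and 47 are probable mistakes at the lower setting, but only move 12 is confirmed, because its delta grows with more visits. Move 30 is never examined, since it was below the threshold at the lower setting.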

Edit: Edited for clarity and improved memory.

Gomoto · #12

Perhaps my remarks were misleading.

I did not intend to suggest that 95% of all mistakes made are eliminated.

I did intend to compare the usefulness of desktop (100k visits) and laptop (100 visits). To my astonishment, with my laptop I am able to identify around 95% of the mistakes that I can identify with my desktop.

(I review my games on the road with the laptop after real-life tournaments, and then I review further on my desktop when I am back at home. I can't remember missing a critical mistake on the laptop that I only discovered on the desktop. That is all I wanted to say. Surely 100k visits are preferable, but even 100 visits are a great tool for reviewing games. 100 visits show you good candidate moves that you perhaps missed in your game.)

A further remark:
If I really want to evaluate a move with AI, I never take the percentage shown for the move. I always enter the variations and evaluate the move from the endpoints of the variations. I think for now it is the wrong concept to think that if one increases the visits, one gets a substantially more reliable result. The result is much more reliable when you continue the variations and compare them.

Kirby · #13

Thanks, Gomoto. That way of phrasing your thoughts makes a lot more sense to me.

At my level of play, using Elf to review a game on my cheap laptop is probably enough.

I have to admit, part of the reason I made this thread was to convince myself to buy a high-end computer.

Sadly, I'll have to find another excuse.

Uberdude · #14

Kirby wrote:

I have to admit, part of the reason I made this thread was to convince myself to buy a high-end computer.

Sadly, I'll have to find another excuse.

If you want Elf to not epic fail at ladders.

Kirby · #15

Haha, yeah. Btw, I am referring to Elf now because I started using the elf network weights.

Is there any reason to use LeelaZero's weights?

Elf is stronger, right? I suppose you can get more ideas on a position by using both.

Mike Novack · #16

We have drifted away from the original question. Kirby wasn't asking how performance changes with the number of playouts, but whether performance depends on whether the playouts are done with or without a GPU.

With a GPU the calculations can be performed faster, and since go is ordinarily played with time controls, time per move is the constraint, not number of playouts. But if I understand Kirby's question, he is saying "ignore time required" << time might not be as much a factor in analysis the way it is in play >>

Offhand, as a computer person, I'd say in theory no. If something is computable, it is computable on a Turing Machine (or a Wang Machine, etc.), just very, very slowly. BUT, and this is a big but, I do not know that these programs are in effect using the SAME algorithms when using a GPU as when running without one. If they are the same, just passing computations off to the GPU to be done faster in parallel, I'd expect no difference.

Uberdude · #17

X playouts done in parallel on a GPU, or multiple GPUs, might actually give slightly different results from X playouts done in series on a single CPU (and this depends on the number of cores of the CPU/GPU). This is apparently because splitting the work into batches might increase exploration; I imagine each batch being independent could mean that different random fluctuations take precedence in each. See https://www.reddit.com/r/baduk/comments ... i/e1i687s/ and the reply from Skuto below it. I wonder if this is why Elf on the LZ engine seems not to be exploratory enough, while perhaps the real Elf, run with their engine on their hardware setup for the matches against pros, was better.

Uberdude · #18

Kirby wrote:

Is there any reason to use LeelaZero's weights?

Elf is stronger, right? I suppose you can get more ideas on a position by using both.

Yes. LZ is better (but still has problems) at ladders. My impression is that Elf tends to play bad moves in which it doesn't realise it can be captured in a working ladder, whereas LZ plays bad ataris trying to ladder stones which can escape but it assumes they won't. I.e., without playouts Elf assumes (almost) all ladders don't work, whereas LZ assumes (almost) all ladders do work. Is the former a safer delusion? LZ needs fewer playouts than Elf to realise the truth about the ladder.

LZ has less extreme opinions than Elf, and this can be helpful when reviewing lopsided games. If I have what I consider a moderate advantage in my mid-dan game, Elf might give me 99.2%. When the winrate is so high or low, the quality of the analysis drops. It doesn't turn to junk moves and still plays sensibly, unlike AlphaGo Lee, but there's not much difference between the best and mediocre moves. On the other hand, LZ just gives me 65%, so there's still a lot of game left before 99% land. So Elf is great if you want to find subtle direction mistakes in pros' openings, but I prefer LZ for my middlegame reviews.

Also, Elf can be very jumpy in winrate, particularly on low playouts. Sometimes I think it is quickly identifying big mistakes, but other times it's not reading enough.

And yes by reading the gospel according to LZ and the gospel according to Elf you can see which bits they have in common and which differ and wonder where the truth lies.

P.S. Elf tends to like approaching 4-4s whereas LZ 3-3 invades more so if you don't like the invasions that's a plus for Elf.
