How can I let Leela Zero use two rtx gpus ???

goame · Post by **goame** » Tue May 14, 2019 2:19 pm

Leela Zero is using only one GPU, but I have two identical GPUs.
Here is my Lizzie config:

{
"leelaz": {
"max-analyze-time-minutes": 60,
"analyze-update-interval-centisec": 10,
"network-file": "network.gz",
"max-game-thinking-time-seconds": 2,
"engine-start-location": ".",
"engine-command": "./leela-zero/leelaz --gtp --lagbuffer 0 --weights %network-file",
"print-comms": false
},
"ui": {
"comment-font-size": 0,
"board-color": [
217,
152,
77
],
"shadow-size": 100,
"show-winrate": true,
"autosave-interval-seconds": -1,
"append-winrate-to-comment": true,
"fancy-board": true,
"show-captured": true,
"weighted-blunder-bar-height": false,
"win-rate-always-black": false,
"show-move-number": true,
"winrate-stroke-width": 3,
"show-next-moves": true,
"show-comment": true,
"show-leelaz-variation": true,
"theme": "default",
"min-playout-ratio-for-stats": 0,
"fancy-stones": true,
"resume-previous-game": false,
"window-size": [
3840,
2160
],
"new-move-number-in-branch": true,
"shadows-enabled": true,
"show-variation-graph": true,
"show-dynamic-komi": true,
"minimum-blunder-bar-width": 3,
"large-winrate": false,
"show-blunder-bar": true,
"only-last-move-number": 1,
"confirm-exit": false,
"show-status": true,
"handicap-instead-of-winrate": false,
"large-subboard": false,
"dynamic-winrate-graph-width": true,
"show-subboard": true,
"window-maximized": true,
"show-best-moves": true,
"board-size": 19
}
}

Calvin Clark · Post by **Calvin Clark** » Tue May 14, 2019 2:56 pm

And if you add --gpu 0 --gpu 1 to that line, does it change anything?

goame · Post by **goame** » Tue May 14, 2019 3:49 pm

Calvin Clark wrote: And if you add --gpu 0 --gpu 1 to that line, does it change anything?

Where exactly do I need to paste this?

Calvin Clark · Post by **Calvin Clark** » Tue May 14, 2019 4:38 pm

goame wrote:
Where exactly do I need to paste this?

Short answer:

These are command line parameters to the Leela Zero binary leelaz.exe itself, which is launched by Lizzie, so I suggest you try replacing the engine-command line with this:

"engine-command": "./leela-zero/leelaz --gtp --lagbuffer 0 --weights %network-file --gpu 0 --gpu 1",

Long answer in another post to follow...
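
For anyone who would rather script the change than hand-edit the file, here is a minimal sketch in Python. It assumes the Lizzie config is plain JSON with the `leelaz` / `engine-command` layout shown in the first post; the helper name `add_gpu_flags` is made up for this example and is not part of Lizzie or Leela Zero.

```python
import json


def add_gpu_flags(config_text, gpu_ids):
    """Return config JSON text with '--gpu N' appended to the engine command."""
    config = json.loads(config_text)
    command = config["leelaz"]["engine-command"]
    tokens = command.split()
    # Collect device ids that are already passed as "--gpu N" pairs.
    present = {tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == "--gpu"}
    for gpu_id in gpu_ids:
        if str(gpu_id) not in present:
            command += f" --gpu {gpu_id}"
    config["leelaz"]["engine-command"] = command
    return json.dumps(config, indent=2)


# Trimmed-down config containing only the key this sketch touches.
sample = (
    '{"leelaz": {"engine-command": '
    '"./leela-zero/leelaz --gtp --lagbuffer 0 --weights %network-file"}}'
)
updated = json.loads(add_gpu_flags(sample, [0, 1]))
print(updated["leelaz"]["engine-command"])
```

Running it twice on the same config should not duplicate the flags, since ids that are already present are skipped.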

goame · Post by **goame** » Wed May 15, 2019 3:05 pm

Calvin Clark wrote:
These are command line parameters to the Leela Zero binary leelaz.exe itself, which is launched by Lizzie, so I suggest you try replacing the engine-command line with this:
"engine-command": "./leela-zero/leelaz --gtp --lagbuffer 0 --weights %network-file --gpu 0 --gpu 1",

It does not work correctly.
Lizzie opens, but it stays on "Leela Zero is loading..." indefinitely.
I can still use the X button and place moves on the board.

My config looks like this now:
{
"leelaz": {
"max-analyze-time-minutes": 60,
"analyze-update-interval-centisec": 10,
"network-file": "network.gz",
"max-game-thinking-time-seconds": 2,
"engine-start-location": ".",
"engine-command": "./leela-zero/leelaz --gtp --lagbuffer 0 --weights %network-file --gpu 0 --gpu 1",
"print-comms": false
},
"ui": {
"comment-font-size": 0,
"board-color": [
217,
152,
77
],
"shadow-size": 100,
"show-winrate": true,
"autosave-interval-seconds": -1,
"append-winrate-to-comment": true,
"fancy-board": true,
"show-captured": true,
"weighted-blunder-bar-height": false,
"--gpu 0 --gpu 1 --gpu 2 --gpu 3": true,
"win-rate-always-black": false,
"show-move-number": true,
"winrate-stroke-width": 3,
"show-next-moves": true,
"show-comment": true,
"show-leelaz-variation": true,
"theme": "default",
"min-playout-ratio-for-stats": 0,
"fancy-stones": true,
"resume-previous-game": false,
"window-size": [
3840,
2160
],
"new-move-number-in-branch": true,
"shadows-enabled": true,
"show-variation-graph": true,
"show-dynamic-komi": true,
"minimum-blunder-bar-width": 3,
"large-winrate": false,
"show-blunder-bar": true,
"only-last-move-number": 1,
"confirm-exit": false,
"show-status": true,
"handicap-instead-of-winrate": false,
"large-subboard": false,
"dynamic-winrate-graph-width": true,
"show-subboard": true,
"window-maximized": true,
"show-best-moves": true,
"board-size": 19
}
}

Calvin Clark · Post by **Calvin Clark** » Thu May 16, 2019 1:46 am

Remove the entry that says "--gpu 0 --gpu 1 --gpu 2 --gpu 3": true, as that is not a valid parameter to Lizzie and try again.
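
A quick way to spot this kind of stray entry is to look for config keys that look like command-line flags. The sketch below is only a heuristic, assuming the config is JSON with top-level sections such as `leelaz` and `ui`; the function name `strip_flag_like_keys` is hypothetical, and nothing here reflects how Lizzie itself validates its config.

```python
import json


def strip_flag_like_keys(config):
    """Remove keys starting with '--' from each section; return what was removed."""
    removed = []
    for section, settings in config.items():
        if not isinstance(settings, dict):
            continue
        # Iterate over a copy of the keys so entries can be deleted safely.
        for key in list(settings):
            if key.startswith("--"):
                removed.append((section, key))
                del settings[key]
    return removed


# Small excerpt of the config posted above, including the stray entry.
config = json.loads(
    '{"leelaz": {"print-comms": false},'
    ' "ui": {"weighted-blunder-bar-height": false,'
    ' "--gpu 0 --gpu 1 --gpu 2 --gpu 3": true,'
    ' "win-rate-always-black": false}}'
)
print(strip_flag_like_keys(config))
print(sorted(config["ui"]))
```

The `--gpu` flags belong only inside the `engine-command` string, so any key of this shape elsewhere in the file is almost certainly a paste mistake.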

If that does not work, then we need to see the output of running leelaz by itself. To show that, run this from a command prompt in the same directory where Lizzie.jar resides:

If you are running on Windows:

.\leela-zero\leelaz.exe -w network.gz

If you are running on Mac or Linux, use forward slashes (and the binary normally has no .exe extension):

./leela-zero/leelaz -w network.gz

On a two-GPU system, you should see output like this:

E:\Lizzie>.\leela-zero\leelaz.exe -w network.gz
Using 2 thread(s).
RNG seed: 2961845906471187626
Leela Zero 0.16  Copyright (C) 2017-2018  Gian-Carlo Pascutto and contributors
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; see the COPYING file for details.

BLAS Core: Haswell
Detecting residual layers...v1...256 channels...40 blocks.
Initializing OpenCL (autodetecting precision).
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 9.2.102
Platform profile: FULL_PROFILE
Platform name:    NVIDIA CUDA
Platform vendor:  NVIDIA Corporation
Device ID:     0
Device name:   Tesla K80
Device type:   GPU
Device vendor: NVIDIA Corporation
Device driver: 397.44
Device speed:  823 MHz
Device cores:  13 CU
Device score:  1112
Device ID:     1
Device name:   Tesla K80
Device type:   GPU
Device vendor: NVIDIA Corporation
Device driver: 397.44
Device speed:  823 MHz
Device cores:  13 CU
Device score:  1112
Selected platform: NVIDIA CUDA
Selected device: Tesla K80
with OpenCL 1.2 capability.
Half precision compute support: No.
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 9.2.102
Platform profile: FULL_PROFILE
Platform name:    NVIDIA CUDA
Platform vendor:  NVIDIA Corporation
Device ID:     0
Device name:   Tesla K80
Device type:   GPU
Device vendor: NVIDIA Corporation
Device driver: 397.44
Device speed:  823 MHz
Device cores:  13 CU
Device score:  1112
Device ID:     1
Device name:   Tesla K80
Device type:   GPU
Device vendor: NVIDIA Corporation
Device driver: 397.44
Device speed:  823 MHz
Device cores:  13 CU
Device score:  1112
Selected platform: NVIDIA CUDA
Selected device: Tesla K80
with OpenCL 1.2 capability.
Half precision compute support: No.
Loaded existing SGEMM tuning.
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
Loaded existing SGEMM tuning.
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
Using OpenCL single precision (less than 5% slower than half).
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 9.2.102
Platform profile: FULL_PROFILE
Platform name:    NVIDIA CUDA
Platform vendor:  NVIDIA Corporation
Device ID:     0
Device name:   Tesla K80
Device type:   GPU
Device vendor: NVIDIA Corporation
Device driver: 397.44
Device speed:  823 MHz
Device cores:  13 CU
Device score:  1112
Device ID:     1
Device name:   Tesla K80
Device type:   GPU
Device vendor: NVIDIA Corporation
Device driver: 397.44
Device speed:  823 MHz
Device cores:  13 CU
Device score:  1112
Selected platform: NVIDIA CUDA
Selected device: Tesla K80
with OpenCL 1.2 capability.
Half precision compute support: No.
Loaded existing SGEMM tuning.
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
Setting max tree size to 3736 MiB and cache size to 415 MiB.

Passes: 0            Black (X) Prisoners: 0
Black (X) to move    White (O) Prisoners: 0

   a b c d e f g h j k l m n o p q r s t
19 . . . . . . . . . . . . . . . . . . . 19
18 . . . . . . . . . . . . . . . . . . . 18
17 . . . . . . . . . . . . . . . . . . . 17
16 . . . + . . . . . + . . . . . + . . . 16
15 . . . . . . . . . . . . . . . . . . . 15
14 . . . . . . . . . . . . . . . . . . . 14
13 . . . . . . . . . . . . . . . . . . . 13
12 . . . . . . . . . . . . . . . . . . . 12
11 . . . . . . . . . . . . . . . . . . . 11
10 . . . + . . . . . + . . . . . + . . . 10
 9 . . . . . . . . . . . . . . . . . . .  9
 8 . . . . . . . . . . . . . . . . . . .  8
 7 . . . . . . . . . . . . . . . . . . .  7
 6 . . . . . . . . . . . . . . . . . . .  6
 5 . . . . . . . . . . . . . . . . . . .  5
 4 . . . + . . . . . + . . . . . + . . .  4
 3 . . . . . . . . . . . . . . . . . . .  3
 2 . . . . . . . . . . . . . . . . . . .  2
 1 . . . . . . . . . . . . . . . . . . .  1
   a b c d e f g h j k l m n o p q r s t

Hash: 9A930BE1616C538E Ko-Hash: A14C933E7669946D

Black time: 01:00:00
White time: 01:00:00

Leela:

The key thing to note here, other than I haven't gotten around to updating to LZ 0.17, is that two Device IDs are detected: 0 and 1. Show us what you are seeing. Also, have you recently added the 2nd GPU after running Lizzie before? If so, it's not a bad idea to remove the tuning file 'leelaz_opencl_tuning' before doing the above steps. This will cause Leela Zero to regenerate the tuning file by running some benchmarks. That can take a couple of minutes the first time, so be patient. The tuning process should also see your 2nd GPU. If it does not, then we need to look more deeply at OS specifics.
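
If the startup log is long, the device list can also be pulled out with a few lines of Python. This is a rough sketch that assumes the log uses the `Device ID:` and `Device type:` lines exactly as printed above; `gpu_device_ids` is a name invented for this example.

```python
import re


def gpu_device_ids(log_text):
    """Return the sorted, de-duplicated device IDs whose type is GPU."""
    ids = set()
    current_id = None
    for line in log_text.splitlines():
        id_match = re.match(r"\s*Device ID:\s*(\d+)", line)
        if id_match:
            current_id = int(id_match.group(1))
            continue
        type_match = re.match(r"\s*Device type:\s*(\S+)", line)
        if type_match and current_id is not None:
            if type_match.group(1) == "GPU":
                ids.add(current_id)
            current_id = None
    return sorted(ids)


# Excerpt of the two-GPU output shown above.
excerpt = """\
Device ID:     0
Device name:   Tesla K80
Device type:   GPU
Device ID:     1
Device name:   Tesla K80
Device type:   GPU
"""
print(gpu_device_ids(excerpt))  # prints [0, 1]
```

Leela Zero prints the detection block several times during startup, which is why the IDs are collected into a set before sorting.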

goame · Post by **goame** » Thu May 16, 2019 11:11 am

Calvin Clark wrote: Remove the entry that says "--gpu 0 --gpu 1 --gpu 2 --gpu 3": true, as that is not a valid parameter to Lizzie and try again.

Removing the "--gpu 0 --gpu 1 --gpu 2 --gpu 3" entry does not help. This is the output I get:

Using OpenCL batch size of 5
Using 10 thread(s).
RNG seed: 5707416252940862580
Leela Zero 0.17 Copyright (C) 2017-2019 Gian-Carlo Pascutto and contributors
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; see the COPYING file for details.

BLAS Core: Sandybridge
Detecting residual layers...v1...256 channels...40 blocks.
Initializing OpenCL (autodetecting precision).
Detected 2 OpenCL platforms.
Platform version: OpenCL 2.0 AMD-APP (2079.4)
Platform profile: FULL_PROFILE
Platform name: AMD Accelerated Parallel Processing
Platform vendor: Advanced Micro Devices, Inc.
Device ID: 0
Device name: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
Device type: CPU
Device vendor: GenuineIntel
Device driver: 2079.4 (sse2,avx)
Device speed: 3200 MHz
Device cores: 6 CU
Device score: 520
Platform version: OpenCL 1.2 CUDA 10.0.150
Platform profile: FULL_PROFILE
Platform name: NVIDIA CUDA
Platform vendor: NVIDIA Corporation
Device ID: 1
Device name: GeForce RTX 2080 Ti
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 411.70
Device speed: 1545 MHz
Device cores: 68 CU
Device score: 1112
Device ID: 2
Device name: GeForce RTX 2080 Ti
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 411.70
Device speed: 1545 MHz
Device cores: 68 CU
Device score: 1112
Selected platform: NVIDIA CUDA
Selected device: GeForce RTX 2080 Ti
with OpenCL 1.2 capability.
Half precision compute support: No.
Tensor Core support: Yes.
OpenCL: using fp16/half or tensor core compute support.

Started OpenCL SGEMM tuner.
Will try 380 valid configurations.
(1/380) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=2 0.1145 ms (5150.0 GFLOPS)
(2/380) KWG=16 KWI=8 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=4 0.0943 ms (6257.2 GFLOPS)
(5/380) KWG=32 KWI=2 MDIMA=16 MDIMC=16 MWG=128 NDIMB=16 NDIMC=32 NWG=32 SA=0 SB=0 STRM=0 STRN=0 TCE=1 VWM=2 VWN=2 0.0921 ms (6404.4 GFLOPS)
(10/380) KWG=16 KWI=2 MDIMA=32 MDIMC=32 MWG=64 NDIMB=8 NDIMC=16 NWG=32 SA=0 SB=0 STRM=0 STRN=0 TCE=1 VWM=2 VWN=2 0.0915 ms (6449.3 GFLOPS)
(11/380) KWG=32 KWI=2 MDIMA=8 MDIMC=16 MWG=64 NDIMB=32 NDIMC=32 NWG=32 SA=0 SB=0 STRM=0 STRN=0 TCE=1 VWM=2 VWN=2 0.0823 ms (7165.0 GFLOPS)
(14/380) KWG=16 KWI=2 MDIMA=16 MDIMC=32 MWG=128 NDIMB=16 NDIMC=16 NWG=32 SA=0 SB=0 STRM=0 STRN=0 TCE=1 VWM=2 VWN=2 0.0697 ms (8461.8 GFLOPS)
(25/380) KWG=32 KWI=2 MDIMA=8 MDIMC=8 MWG=128 NDIMB=32 NDIMC=32 NWG=32 SA=0 SB=0 STRM=0 STRN=0 TCE=1 VWM=2 VWN=2 0.0613 ms (9616.3 GFLOPS)
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
Setting max tree size to 3736 MiB and cache size to 415 MiB.

Passes: 0            Black (X) Prisoners: 0
Black (X) to move    White (O) Prisoners: 0

   a b c d e f g h j k l m n o p q r s t
19 . . . . . . . . . . . . . . . . . . . 19
18 . . . . . . . . . . . . . . . . . . . 18
17 . . . . . . . . . . . . . . . . . . . 17
16 . . . + . . . . . + . . . . . + . . . 16
15 . . . . . . . . . . . . . . . . . . . 15
14 . . . . . . . . . . . . . . . . . . . 14
13 . . . . . . . . . . . . . . . . . . . 13
12 . . . . . . . . . . . . . . . . . . . 12
11 . . . . . . . . . . . . . . . . . . . 11
10 . . . + . . . . . + . . . . . + . . . 10
 9 . . . . . . . . . . . . . . . . . . .  9
 8 . . . . . . . . . . . . . . . . . . .  8
 7 . . . . . . . . . . . . . . . . . . .  7
 6 . . . . . . . . . . . . . . . . . . .  6
 5 . . . . . . . . . . . . . . . . . . .  5
 4 . . . + . . . . . + . . . . . + . . .  4
 3 . . . . . . . . . . . . . . . . . . .  3
 2 . . . . . . . . . . . . . . . . . . .  2
 1 . . . . . . . . . . . . . . . . . . .  1
   a b c d e f g h j k l m n o p q r s t

Hash: 9A930BE1616C538E Ko-Hash: A14C933E7669946D

Black time: 01:00:00
White time: 01:00:00

Leela:

Calvin Clark · Post by **Calvin Clark** » Thu May 16, 2019 3:27 pm

Ah, so your RTX device IDs are 1 and 2, not 0 and 1. Device 0 on your machine is the CPU, which shows up through the AMD OpenCL platform. Try this instead, then:

"engine-command": "./leela-zero/leelaz --gtp --lagbuffer 0 --weights %network-file --gpu 1 --gpu 2",

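The general rule is to pass one `--gpu N` per device whose type is GPU and to skip everything else. A small Python sketch of that rule follows; the device list is transcribed by hand from the leelaz output posted above, and `gpu_flags` is a hypothetical helper, not anything shipped with Leela Zero or Lizzie.

```python
def gpu_flags(devices):
    """Build the '--gpu N' flag string for every device whose type is GPU."""
    return " ".join(
        f"--gpu {dev_id}" for dev_id, dev_type in devices if dev_type == "GPU"
    )


# (device id, device type) pairs as reported in the log: one CPU, two RTX cards.
devices = [(0, "CPU"), (1, "GPU"), (2, "GPU")]
base = "./leela-zero/leelaz --gtp --lagbuffer 0 --weights %network-file"
print(f"{base} {gpu_flags(devices)}")
```

On a machine where the CPU is not exposed as an OpenCL device, the same rule gives `--gpu 0 --gpu 1`, which matches the earlier two-GPU example.
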
wineandgolover · Post by **wineandgolover** » Sun Oct 04, 2020 4:40 pm

Calvin Clark wrote: Ah, so your RTX device ids are 1 and 2, not 0 and 1. Try this instead, then:

"engine-command": "./leela-zero/leelaz --gtp --lagbuffer 0 --weights %network-file --gpu 1 --gpu 2",

Hey Calvin, I'm not the OP, but I wanted to let you know that your solution in the last post worked for me, with two different GPUs, one internal and one external. So, it wasn't a waste of your time after all!

Anyway, thanks for more than doubling the speed of LZ for me.
