DRIVERS, SOFTWARE, CORES and LIBRARIES
We have our new Windows computer with a dedicated graphics card. First, we install its driver. For an Nvidia graphics card, we can install the Game Ready driver, which is updated more frequently and provides a few additional features for 3D games, or the Studio driver, which emphasises long-term stability and provides a few additional features for work software. Whichever kind of driver we prefer, it must fit the operating system, such as Windows 11 64-bit, and usually should be the newest stable version.
We need both an engine and a GUI (graphical user interface). An engine is go AI (artificial intelligence) software, such as KataGo, that generates moves. A GUI, such as Lizzie or KaTrain, displays the go board. The GUI calls the engine so that both run simultaneously and interact with each other. We interact with the GUI.
Practically every current graphics card supports OpenCL as an application interface between software and graphics card. AMD graphics cards only support OpenCL. Nvidia graphics cards support OpenCL and have CUDA cores, tensor cores and RT (raytracing) cores. KataGo supports OpenCL, CUDA cores and tensor cores. OpenCL is the easiest: KataGo needs little more than its OpenCL.dll library file. With OpenCL, KataGo can also use tensor cores without our telling it to do so. If KataGo is to use CUDA cores, it needs both Nvidia's CUDA libraries and Nvidia's CuDNN libraries, whose installation we discuss later. If KataGo is to make the best use of tensor cores, it needs Nvidia's TensorRT libraries, whose installation we also discuss later. Hence, there are different kinds of cores and different kinds of libraries. Some libraries are what enables software to use certain cores at all; others enable more efficient use of particular cores.
GETTING STARTED
We have our new computer and want to quickly test whether its dedicated graphics card allows us to run a GUI and KataGo. We might not want to start by testing everything at once: OpenCL, CUDA cores and tensor cores, and the CUDA, CuDNN and TensorRT libraries. A convenient start is the Baduk Megapack at
https://github.com/wonsiks/BadukMegapack , which comes as an installer (currently version 4.18.0, as of 2023-06-12) for Windows 11 64-bit and installs a couple of GUIs and instances of KataGo.
For now and for the sake of simplicity, we keep the recommended installation directory C:\baduk and are logged in as a Windows administrator user. Some go software programmers are at home in the Linux world and do not respect the Windows security convention that applications are installed to the write-protected C:\Program Files or C:\Program Files (x86) directories and run by a Windows standard user. Such go software later wants to write files in its installation directory. We postpone the related management of Windows security but might disable internet connections while logged in as a Windows administrator user.
The installation process of Baduk Megapack comes with a surprise: a command line window logs various things and interacts with us so that it can initially tune KataGo and adjust it at least roughly to our dedicated graphics card. For now, most of the questions can be answered roughly. If we can leave a parameter empty at its default, we just do so. However, there is one absolutely essential question. One or a few graphics card devices are listed and each has a number 0, 1, etc. We write the stated Device number of our dedicated graphics card and nothing else. For example, the text Found OpenCL Device 1 = RTX 4070 indicates that we must write 1 . (If you have several dedicated graphics cards, list them. I would, however, not include the integrated graphics card so that your CPU remains cooler and the software has fewer opportunities to exhibit bugs. Overclockers may have a different opinion.) After answering the query, we are patient and watch the initial tuning progress.
After installation of Baduk Megapack, we can try KaTrain. The Hamburger menu (click on the three horizontal bars) gives access to General & Engine Settings. Click Download KataGo version and select the OpenCL instance of KataGo, whose path is C:\baduk\lizzie\katago.exe . Click Download Models and choose one of the *.gz or *.bin.gz files. Afterwards, there should be entries in the three rows Path to KataGo executable, Path to KataGo config file and Path to KataGo model file. Override Engine Command remains empty for now. Adjust the Maximum time for analysis. (A small value lets us see an operating AI quickly while a large value lets us check GPU usage and running processes in suitable tools.) Click on Update Settings and possibly wait for a few minutes. Close this tab with ESC. If KaTrain freezes at this moment, kill its process and restart KaTrain. Then we may find that General & Engine Settings have the right entries. In the Player Setup, set, for example, Black Human and White AI; press ESC. Click on the board and the engine should reply with its moves. The Windows task manager (CTRL + SHIFT + ESC) shows some temperature increase of a dedicated graphics card, or its load under Furmark, but has trouble showing the GPU load caused by advanced software. We can use a tool, such as HWiNFO64, to monitor GPU load. We know that the go engine works and uses our dedicated graphics card if HWiNFO64 indicates a GPU load of 50% to 100%, typically 94% to 96%, while the engine is pondering.
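For Nvidia cards, the GPU load can also be read from a script instead of a monitoring tool. The following is only a minimal sketch, assuming that the nvidia-smi command line tool, which ships with the Nvidia driver, is on the PATH. The function names are made up for this illustration.

```python
import subprocess


def parse_gpu_utilization(csv_text: str) -> list[int]:
    """Turn nvidia-smi output (one percentage per line) into integers.

    Lines that are not plain numbers, such as "[N/A]", are skipped.
    """
    loads = []
    for line in csv_text.splitlines():
        value = line.strip()
        if value.isdigit():
            loads.append(int(value))
    return loads


def query_gpu_utilization() -> list[int]:
    """Ask nvidia-smi for the current load of every Nvidia GPU, in percent."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_utilization(result.stdout)
```

Calling query_gpu_utilization() a few times while the engine is pondering should then show values in the range described above if the engine really uses the dedicated graphics card.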
KATAGO INSTALLATION
For a proper installation of KataGo, we go to its webpage
https://github.com/lightvector/KataGo and download from
https://github.com/lightvector/KataGo/releases its Windows versions for OpenCL and, if needed for an Nvidia graphics card, CUDA and TensorRT. Typically, the download files come as compressed ZIP archives. Here, we face a minor difficulty: do we need the files with or without bs29? bs29 is for board sizes up to 29x29. As beginners, we avoid such extras and choose the files without bs29. The download files have names like these:
Code:
katago-v1.13.0-opencl-windows-x64.zip
katago-v1.13.0-cuda11.2-windows-x64.zip
katago-v1.13.1-trt8.5-cuda11.2-windows-x64.zip
These are meaningful file names but we must be able to decipher them. 1.13.0 or 1.13.1 is KataGo's version number. x64 denotes 64-bit Windows. opencl is the KataGo version for OpenCL. cuda11.2 is the KataGo version built for the CUDA libraries in their version 11.2, together with matching CuDNN libraries (CuDNN has its own version numbering). trt8.5-cuda11.2 is the KataGo version for the TensorRT library in its version 8.5, which relies on the CUDA libraries in their version 11.2 and matching CuDNN libraries. Although a KataGo download file contains some DLLs, it does not contain the CUDA, CuDNN and TensorRT library files, which we must obtain separately from other sources.
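The deciphering of such file names can be sketched in a few lines of Python. This is only an illustration under the assumption that release names follow the pattern of the three examples above; the function name and the keys of the result are made up here.

```python
def parse_release_name(file_name: str) -> dict:
    """Split a KataGo release file name into version, backend and library versions.

    Assumes names shaped like the examples above, e.g.
    katago-v1.13.1-trt8.5-cuda11.2-windows-x64.zip
    """
    stem = file_name[:-len(".zip")] if file_name.endswith(".zip") else file_name
    parts = stem.split("-")
    info = {
        "version": parts[1].lstrip("v"),  # "v1.13.0" -> "1.13.0"
        "backend": None,
        "cuda": None,
        "tensorrt": None,
    }
    for part in parts[2:]:
        if part == "opencl":
            info["backend"] = "OpenCL"
        elif part.startswith("trt"):
            info["backend"] = "TensorRT"
            info["tensorrt"] = part[len("trt"):]
        elif part.startswith("cuda"):
            info["cuda"] = part[len("cuda"):]
            # A TensorRT build also names its CUDA version; keep "TensorRT" then.
            if info["backend"] is None:
                info["backend"] = "CUDA"
    return info
```

For example, parse_release_name("katago-v1.13.0-cuda11.2-windows-x64.zip") reports the backend CUDA with CUDA version 11.2 and no TensorRT version.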
It can sometimes happen that a download page of KataGo does not contain all three versions. In this case, we must visit several subpages of KataGo's webpage to get them all.
Furthermore, on KataGo's webpage, we find and download from
https://katagotraining.org/networks/ a model file, which is a pretrained neural net. Usually, newer model files are better than older model files. However, there is the additional aspect that models come in different block sizes. In the early days, larger block sizes indicated stronger models. Currently, this is no longer the case and the block size 18 is the strongest for typical usage. We recognise it by b18 early in its name, such as kata1-b18c384nbt-s6386600960-d3368371862.bin.gz . Model files are compressed as *.bin.gz or *.gz. For our use, we do not decompress them but simply use them as they are. The long digit strings at the tail of a file name do not matter for our purposes so, for convenience, we may rename the file to, say, b18.bin.gz .
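If we keep several downloaded model files, a small helper can pick out the b18 ones by name. This is a minimal sketch: the helper names are made up, and reading the number after c as a channel count is my understanding of the naming, not something stated on this page.

```python
import re

# Matches the "b18c384" part in names like kata1-b18c384nbt-s...-d....bin.gz
MODEL_RE = re.compile(r"(?:^|-)b(\d+)c(\d+)")


def model_shape(file_name: str):
    """Return (blocks, channels) read from a model file name, or None."""
    match = MODEL_RE.search(file_name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_b18(file_name: str) -> bool:
    """True if the file name indicates block size 18."""
    shape = model_shape(file_name)
    return shape is not None and shape[0] == 18
```

A renamed file such as b18.bin.gz no longer carries the full pattern, so model_shape returns None for it; the check is only meant for the original download names.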
Install the contents of each ZIP file to its own directory. That is, use, for example, the Windows Explorer to unpack a particular ZIP and then copy the contained files and any folders to its installation folder. For example, create the directories
Code:
C:\katago_OpenCL
C:\katago_CUDA
C:\katago_TensorRT
and install the appropriate files to their directory. Furthermore, copy the model file b18.bin.gz to each of the three directories. This wastes disk space but later eases calling the model file. Alternatively, we can store model files in a separate directory of their own and write that path when calling one of them.
Before we can use any of these three versions of KataGo, we need three to five further preparation steps:
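Unpacking a ZIP archive and copying the model file can also be scripted. The following is a sketch only, using Python's standard library; the function name install_katago is made up, and the archive and model paths are whatever we downloaded.

```python
import shutil
import zipfile
from pathlib import Path


def install_katago(zip_path, model_path, target_dir, model_name="b18.bin.gz"):
    """Unpack one KataGo ZIP into target_dir and put a copy of the model next to it."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Keeps whatever files and folders the archive contains.
        archive.extractall(target)
    copied_model = target / model_name
    shutil.copyfile(model_path, copied_model)
    return copied_model
```

We would call it once per version, for example with C:\katago_OpenCL, C:\katago_CUDA and C:\katago_TensorRT as target_dir.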
1) For KataGo TensorRT, get another software containing libraries.
2) For KataGo CUDA or KataGo TensorRT, copy the missing library files.
3) Initial benchmark of KataGo.
4) Initial tuning of KataGo.
5) In a GUI, set the command line for calling KataGo.
The following describes this procedure for each version of KataGo.
KATAGO OpenCL
The installation directory, say C:\katago_OpenCL, already contains a copy of the needed OpenCL.dll library file. Therefore, we continue with step 3 of the procedure.
Open the Windows command line, that is C:\Windows\System32\cmd.exe . Go to the right directory using the command
Code:
cd \katago_OpenCL
There, execute the following command (or adjust the file name if you have chosen a different one):
Code:
katago.exe benchmark -model b18.bin.gz
katago.exe calls the program of that name in the current directory. benchmark is a subcommand, which tells KataGo what to do, and therefore carries no minus sign. -model is an option; the minus sign marks it as such and it is followed by its value: our model file. Therefore, KataGo can benchmark with the model file that we will use later when we use KataGo. Execution of the command takes a while. Eventually, KataGo creates the subdirectories \gtp_logs and \KataGoData and the file \KataGoData\opencltuning\tune11_gpuNVIDIAGeForceRTX4070_x19_y19_c384_mv11.txt or a file with a similar name.
As step 4 of the procedure, we are still in the same directory and execute the following command:
Code:
katago.exe genconfig -model b18.bin.gz -output gtp_custom.cfg
The subcommand genconfig starts the tuning for our model file b18.bin.gz and will eventually create the config file gtp_custom.cfg in the same directory. First, the tuning interacts with us, as we already know. When asked, we must specify the right device. Now, however, for another question, we must also choose a useful number of visits. On a modern graphics card, this might be:
Code:
10000
When the tuning starts, we notice whether it proceeds smoothly or is far too slow. If necessary, we can interrupt execution with CTRL + C and execute the command afresh with a much smaller number of visits. Otherwise, we are patient and let the tuning do its job. It writes the appropriate values to the created config file. We close the command line window.
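Steps 3 and 4 can also be started from a small Python script instead of typing both commands. This is a sketch under the assumption that the installation directory contains katago.exe and b18.bin.gz as described above; the function names are made up. The genconfig dialog still appears in the console and must be answered by us.

```python
import subprocess
from pathlib import Path


def step_commands(install_dir, model="b18.bin.gz", output="gtp_custom.cfg"):
    """Build the argument lists for the benchmark (step 3) and genconfig (step 4)."""
    exe = str(Path(install_dir) / "katago.exe")
    benchmark = [exe, "benchmark", "-model", model]
    genconfig = [exe, "genconfig", "-model", model, "-output", output]
    return [benchmark, genconfig]


def run_steps(install_dir):
    """Run both steps in install_dir; stops if a step ends with an error."""
    for argv in step_commands(install_dir):
        # cwd makes the relative model and output names refer to install_dir.
        subprocess.run(argv, cwd=install_dir, check=True)
```

For example, run_steps(r"C:\katago_OpenCL") first benchmarks and then starts the interactive tuning.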
As step 5 of the procedure, we start Lizzie or KaTrain to set the command line for calling KataGo. If we use Lizzie, we go to Settings | Engine | Engine 0, delete the earlier command line and enter this command line:
Code:
C:\katago_OpenCL\katago.exe gtp -model C:\katago_OpenCL\b18.bin.gz -config C:\katago_OpenCL\gtp_custom.cfg
For more variation on Lizzie's syntax of the command line, see
https://www.lifein19x19.com/viewtopic.php?f=18&t=19196 . Lizzie is one of the GUIs communicating with KataGo in the gtp mode. Therefore, the command has the gtp parameter and uses the gtp_custom.cfg config file. Optionally, alter Max Game Thinking Time. Click OK. Choose Game | NewGame(N) and so on. You should be able to play against the AI. If necessary, close and restart Lizzie.
If we use KaTrain, in General & Engine Settings, we set this Override command line:
Code:
C:\katago_OpenCL\katago.exe analysis -model C:\katago_OpenCL\b18.bin.gz -config C:\katago_OpenCL\analysis_config.cfg
For more variation on KaTrain's syntax of the command line, see
https://www.lifein19x19.com/viewtopic.php?f=18&t=19195 . KaTrain is one of the GUIs communicating with KataGo in the analysis mode. Therefore, the command has the analysis parameter and uses the analysis_config.cfg config file. Click Update Settings and press ESC. Unless closing or killing the process and restarting KaTrain is necessary, you should be able to start a new game and play against the AI.
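Since the command lines for Lizzie and KaTrain differ only in the mode and the config file, they can be generated rather than typed. The following is a minimal sketch; the function name engine_command is made up, and the config file names are the ones used in this manual.

```python
from pathlib import PureWindowsPath

# Config file per mode, as used in this manual.
CONFIG_FOR_MODE = {"gtp": "gtp_custom.cfg", "analysis": "analysis_config.cfg"}


def engine_command(install_dir: str, mode: str, model: str = "b18.bin.gz") -> str:
    """Return the command line for Lizzie (mode "gtp") or KaTrain (mode "analysis")."""
    if mode not in CONFIG_FOR_MODE:
        raise ValueError(f"unknown mode: {mode}")
    base = PureWindowsPath(install_dir)
    exe = base / "katago.exe"
    return f"{exe} {mode} -model {base / model} -config {base / CONFIG_FOR_MODE[mode]}"
```

engine_command(r"C:\katago_OpenCL", "gtp") yields the Lizzie line shown above, and the same call with another installation directory yields the line for that version of KataGo.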
KATAGO CUDA
Our installation directory is C:\katago_CUDA . We start at step 2 of the procedure. We copy from C:\baduk\lizzie to C:\katago_CUDA the following five files, of which KataGo CUDA uses all:
Code:
cublas64_11.dll
cublasLt64_11.dll
cudnn_cnn_infer64_8.dll
cudnn_ops_infer64_8.dll
cudnn64_8.dll
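Copying the library files can be scripted so that a missing file is noticed at once. This is a sketch only; the function name copy_libraries is made up, and the default list is the five files named above.

```python
import shutil
from pathlib import Path

# The five library files listed above for KataGo CUDA.
CUDA_DLLS = [
    "cublas64_11.dll",
    "cublasLt64_11.dll",
    "cudnn_cnn_infer64_8.dll",
    "cudnn_ops_infer64_8.dll",
    "cudnn64_8.dll",
]


def copy_libraries(source_dir, target_dir, names=CUDA_DLLS):
    """Copy the named files from source_dir to target_dir.

    Returns the names that were not found in source_dir.
    """
    source, target = Path(source_dir), Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    missing = []
    for name in names:
        candidate = source / name
        if candidate.is_file():
            shutil.copy2(candidate, target / name)
        else:
            missing.append(name)
    return missing
```

For example, copy_libraries(r"C:\baduk\lizzie", r"C:\katago_CUDA") should return an empty list if all five files were present.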
In the benchmark step 3 of the procedure, we open the Windows command line, change directory to C:\katago_CUDA and execute this command:
Code:
katago.exe benchmark -model b18.bin.gz
KataGo creates the subdirectory \gtp_logs .
As step 4 of the procedure, we are still in the same directory and execute the following command, during whose dialog we write the right CUDA Device number and afterwards set the number of visits to, for example, 10000:
Code:
katago.exe genconfig -model b18.bin.gz -output gtp_custom.cfg
KataGo creates the files C:\katago_CUDA\gtp_custom.cfg and, for example, C:\katago_CUDA\gtp_logs\20230609-072808-6A18D901.log .
In step 5 of the procedure, we tell Lizzie the Engine command line
Code:
C:\katago_CUDA\katago.exe gtp -model C:\katago_CUDA\b18.bin.gz -config C:\katago_CUDA\gtp_custom.cfg
or tell KaTrain the Override command line
Code:
C:\katago_CUDA\katago.exe analysis -model C:\katago_CUDA\b18.bin.gz -config C:\katago_CUDA\analysis_config.cfg
KATAGO TensorRT
Our installation directory is C:\katago_TensorRT . It also needs files that are not readily available yet. We begin with step 1 of the procedure and download LizzieYZY as a separate software available at
https://github.com/yzyray/lizzieyzy . The downloaded file is a ZIP archive, which we unpack in the Windows Explorer.
As step 2 of the procedure, the unpacked archive contains the subfolder \katago_tensorRT , from which we copy the following files to C:\katago_TensorRT :
Code:
cublas64_11.dll *
cublasLt64_11.dll *
cudart64_110.dll
cudnn_cnn_infer64_8.dll
cudnn_ops_infer64_8.dll *
cudnn64_8.dll *
msvcr110.dll
nvinfer.dll *
nvinfer_builder_resource.dll
nvrtc64_112_0.dll
nvrtc-builtins64_114.dll
* means that I have seen in Process Explorer that KataGo uses these libraries. I do not know yet whether the other copied files are also needed. However, they do no harm except for possibly occupying disk space unnecessarily.
As step 3 of the procedure, we open the Windows command line, change directory to C:\katago_TensorRT and execute this command:
Code:
katago.exe benchmark -model b18.bin.gz
As step 4 of the procedure, we are still in the same directory and execute the following command, during whose dialog we write the right GPU Device number and afterwards set the number of visits to, for example, 10000:
Code:
katago.exe genconfig -model b18.bin.gz -output gtp_custom.cfg
In step 5 of the procedure, we tell Lizzie the Engine command line
Code:
C:\katago_TensorRT\katago.exe gtp -model C:\katago_TensorRT\b18.bin.gz -config C:\katago_TensorRT\gtp_custom.cfg
or tell KaTrain the Override command line
Code:
C:\katago_TensorRT\katago.exe analysis -model C:\katago_TensorRT\b18.bin.gz -config C:\katago_TensorRT\analysis_config.cfg
On the first start, KataGo TensorRT needs two minutes or more in KaTrain or 30 seconds or more in Lizzie. At later starts, the delay is a few seconds in KaTrain and 10 seconds in Lizzie on my computer. On recent computers, the delays may be worth it because KataGo TensorRT is usually by far the fastest version of KataGo during go move generation.
NVIDIA LIBRARIES
So far, we have created some duplicate files. Some of them are huge so much disk space is wasted. Furthermore, at least on my computer, KataGo CUDA has been slow so far and one of the possible reasons is an outdated library file. Instead of manually copying individual library files, the usual but even more complicated way is to get them from Nvidia's webpage, where one must first register. I have downloaded huge installers and installation manuals from there but not installed any yet. We can get Nvidia CUDA libraries from
https://developer.nvidia.com/cuda-zone , Nvidia CuDNN libraries from
https://developer.nvidia.com/cudnn and Nvidia TensorRT libraries from
https://developer.nvidia.com/tensorrt . We need local executables for Windows 11 or at least Windows 10 of the right versions. I do not know yet whether the newest versions are suitable or whether we need the exact versions declared in KataGo's download file names. If you see GA and EA variants of a version, GA (general availability) is the regular release while EA (early access) is a preview. For a version, Nvidia often offers several subversions, of which we might choose the latest. It is possible that Nvidia's installers also interfere with drivers or install developer software, which we players do not need. We might find installed libraries and copy them or refer to them by the Windows PATH environment variable. If we look at the individual library files above, we notice some numbers in the file names, which might denote version numbers.
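To compare the numbers in the library file names at a glance, we can extract them with a short script. This is a sketch: the function name is made up, and whether a tag such as 112_0 really means a version such as 11.2 is only the guess stated above, so the script returns the raw text and does not interpret it.

```python
import re

# The number group between "64_" and ".dll", e.g. "11" in cublas64_11.dll.
TAG_RE = re.compile(r"64_([\d_]+)\.dll$")


def library_tag(file_name: str):
    """Return the raw number tag from a library file name, or None if there is none."""
    match = TAG_RE.search(file_name)
    return match.group(1) if match else None


def tags_by_file(file_names):
    """Map each file name to its raw number tag."""
    return {name: library_tag(name) for name in file_names}
```

Applied to the files listed for KataGo TensorRT, this gives, for example, the tag 110 for cudart64_110.dll and no tag for nvinfer.dll.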
MISCELLANEOUS
If this manual enables you to install some GUIs and the three versions of KataGo in one day, this would be about ten times faster than I needed without it. However, we are not done yet. Further tuning of each version and additional care for the analysis variant are needed. We can run the genconfig tuning several times with different numbers of visits, such as 5000, 10000, 20000 and 30000, save the created config files under different file names, and compare or modify the values in these config files. We might also let KataGo analyse board positions and compare numbers of visits to judge different config parameters.
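Comparing two generated config files by eye is tedious, so a small script can list only the differing values. This is a minimal sketch, assuming the config files consist of key = value lines with # comments; the function names are made up, and the keys in the usage note are only examples.

```python
def parse_cfg(text: str) -> dict:
    """Read key = value lines from config text, ignoring # comments and blank lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            values[key.strip()] = value.strip()
    return values


def diff_cfg(first: dict, second: dict) -> dict:
    """Return {key: (value in first, value in second)} for every differing key."""
    keys = sorted(set(first) | set(second))
    return {
        key: (first.get(key), second.get(key))
        for key in keys
        if first.get(key) != second.get(key)
    }
```

We would read two saved files, for example the ones created with 5000 and 20000 visits, pass their text through parse_cfg and let diff_cfg show which parameters, such as numSearchThreads, the tuning has set differently.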