The KGS server keeps crashing

Suji
Lives in gote
Posts: 302
Joined: Wed May 19, 2010 2:25 pm
Rank: DDK
GD Posts: 0
KGS: Sujisan 12 kyu
OGS: Sujisan 13 kyu
Has thanked: 70 times
Been thanked: 8 times


Post by Suji »

What's wrong with the server tonight? It's apparently crashed twice.

Hopefully, it's nothing serious.
My plan to become an SDK is here.
wms
Lives in gote
Posts: 450
Joined: Tue Apr 20, 2010 4:23 pm
GD Posts: 0
KGS: wms
Location: Portland, OR USA
Has thanked: 257 times
Been thanked: 287 times

Re: The KGS server keeps crashing

Post by wms »

Not sure what it is. It is a crash that has been there since version 3.0.0; it's a bug that I haven't been able to track down. Usually it hits about once every 60 days, but in the past 24 hours it has hit 4 times instead.

It is possible that whatever causes the bug, somebody has decided to start doing that *A LOT*. But that would be strange, because I didn't think that the bug was caused by anything a user does. It is caused by memory corruption in my low-level networking code. This code is extremely tricky: it is heavily multithreaded, written in C (the only part of the server that is), and uses the Linux epoll interface, which explains why there has been a bug that I've known about for 3+ years but haven't been able to fix.

If it keeps hitting...well, that will be useful information, but I'd rather get it another way of course.
Ember
Lives with ko
Posts: 286
Joined: Sun May 09, 2010 5:32 am
Rank: EGF 3-4k - KGS 2-3k
GD Posts: 0
Online playing schedule: A schedule..? When hell freezes over... maybe. ^^;
Location: Germany
Has thanked: 146 times
Been thanked: 81 times

Re: The KGS server keeps crashing

Post by Ember »

Well, I can't access KGS at all now (from Germany); the homepage seems to be down, too.
At first, when I tried to log in, I immediately got the message that the server might be down. After I waited a bit, that message took some time to appear, and now there is no message at all, but I still can't get onto the server; the client just keeps on trying and trying and... :cry:

But I guess you're already working on it, wms, so I hope that you'll fix that bug soon and everything will be all right. :)

EDIT: Please ignore the message below.. :D

EDIT 2: Well, I guess it was a bit too early to triumph.. ^^; It crashed again.
Last edited by Ember on Sun May 23, 2010 2:19 am, edited 1 time in total.
wms
Lives in gote
Posts: 450
Joined: Tue Apr 20, 2010 4:23 pm
GD Posts: 0
KGS: wms
Location: Portland, OR USA
Has thanked: 257 times
Been thanked: 287 times

Re: The KGS server keeps crashing

Post by wms »

Yeah, tried a reboot to see if that helped. It looks like "no."

The server hasn't had an upgrade for months. So something new changed outside the server that made this bug show up a lot more often. No idea what.
Phelan
Gosei
Posts: 1449
Joined: Tue Apr 20, 2010 3:15 pm
Rank: KGS 6k
GD Posts: 892
Has thanked: 1550 times
Been thanked: 140 times

Re: The KGS server keeps crashing

Post by Phelan »

There were some authentication problems with the desktop version a while back. Are you using that? Either way, a screenshot or the details of the popup might help.
a1h1 [1d]: You just need to curse the gods and defend.
Good Go = Shape.
Associação Portuguesa de Go
CarlJung
Lives in gote
Posts: 429
Joined: Wed Apr 21, 2010 1:10 pm
Rank: SDK
GD Posts: 0
KGS: CarlJung
Location: Sweden
Has thanked: 101 times
Been thanked: 73 times

Re: The KGS server keeps crashing

Post by CarlJung »

Helel wrote:...Swedish text in picture...


Ha ha, didn't know you were Swedish.
tj86430
Gosei
Posts: 1348
Joined: Wed Apr 28, 2010 12:42 am
Rank: FGA 7k GoR 1297
GD Posts: 0
Location: Finland
Has thanked: 49 times
Been thanked: 129 times

Re: The KGS server keeps crashing

Post by tj86430 »

Since my line is quite slow, I'd really appreciate it if the pictures weren't several megabytes...
Offending ad removed
EdLee
Honinbo
Posts: 8859
Joined: Sat Apr 24, 2010 6:49 pm
GD Posts: 312
Location: Santa Barbara, CA
Has thanked: 349 times
Been thanked: 2070 times

Re: The KGS server keeps crashing

Post by EdLee »

Helel wrote:I only did it to annoy you. :twisted: It will not become a habit. ;-)
Helel, you can also try:
(1) Alt+"Prnt Scrn" to grab only the error window.
(2) Paste it in an editor like Irfanview -- http://www.irfanview.com/
(3) Reduce the screenshot size.
(4) Save as jpeg at 80% quality (see attached).
This usually reduces the file size by over 90%.
Or, you can TYPE all the error messages by hand. ;-)
Attachments
xxx.jpg (61.49 KiB)
xxx1.jpg (55.81 KiB)
tj86430
Gosei
Posts: 1348
Joined: Wed Apr 28, 2010 12:42 am
Rank: FGA 7k GoR 1297
GD Posts: 0
Location: Finland
Has thanked: 49 times
Been thanked: 129 times

Re: The KGS server keeps crashing

Post by tj86430 »

Helel wrote:Swedish is, after all, spoken even by Scanians and other riffraff. :twisted:

Yes, even almost ten percent of Finns speak Swedish as their mother tongue (though I'm not one of them).
Offending ad removed
Suji
Lives in gote
Posts: 302
Joined: Wed May 19, 2010 2:25 pm
Rank: DDK
GD Posts: 0
KGS: Sujisan 12 kyu
OGS: Sujisan 13 kyu
Has thanked: 70 times
Been thanked: 8 times

Re: The KGS server keeps crashing

Post by Suji »

wms wrote:Not sure what it is. It is a crash that has been there since version 3.0.0; it's a bug that I haven't been able to track down. Usually it hits about once every 60 days, but in the past 24 hours it has hit 4 times instead.

It is possible that whatever causes the bug, somebody has decided to start doing that *A LOT*. But that would be strange, because I didn't think that the bug was caused by anything a user does. It is caused by memory corruption in my low-level networking code. This code is extremely tricky: it is heavily multithreaded, written in C (the only part of the server that is), and uses the Linux epoll interface, which explains why there has been a bug that I've known about for 3+ years but haven't been able to fix.

If it keeps hitting...well, that will be useful information, but I'd rather get it another way of course.


Hmmm...Interesting. Hopefully, you can find it and fix it.

On a lighter note, there are two quotes that I thought of.

1. "If debugging is the process of removing bugs from a program, then programming is the process in which bugs are introduced to the program."

2. "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."

@WMS: In what way would you like to receive the information?
My plan to become an SDK is here.
wms
Lives in gote
Posts: 450
Joined: Tue Apr 20, 2010 4:23 pm
GD Posts: 0
KGS: wms
Location: Portland, OR USA
Has thanked: 257 times
Been thanked: 287 times

Re: The KGS server keeps crashing

Post by wms »

When there's a bug, it's best if the bug shows up when I test so that I can fix it there.

But this bug happens so rarely, and I don't know how to make it happen, so I only get info about it from crashes...and very, very little info even then. All I've been able to pin down is that variables are getting utter nonsense in them. A boolean, for example, will have a number in it instead of a 0 or 1. I suspect that I'm walking past the end of an array in the network code, or something like that, but I can't figure out where that could be happening.
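
To illustrate the kind of overrun suspected above, here is a minimal sketch. It is not the KGS code: `struct conn`, `unchecked_fill`, and `checked_fill` are hypothetical names, and the "boolean" is assumed to be a plain byte stored directly after a buffer. A copy that is one byte too long lands in the flag and leaves it holding neither 0 nor 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_LEN 8

/* Hypothetical connection record; names are illustrative only. */
struct conn {
    char          buf[BUF_LEN]; /* receive buffer */
    unsigned char closed;       /* "boolean": meant to hold only 0 or 1 */
};

/* Copies len bytes into the record starting at buf with NO check against
 * BUF_LEN. The copy goes through a pointer to the whole struct and the
 * assert keeps it inside the struct, so the demo itself stays well defined. */
void unchecked_fill(struct conn *c, const char *src, size_t len)
{
    assert(len <= sizeof *c - offsetof(struct conn, buf));
    memcpy((unsigned char *)c + offsetof(struct conn, buf), src, len);
}

/* The same copy with the length clamped to the size of the buffer. */
void checked_fill(struct conn *c, const char *src, size_t len)
{
    if (len > BUF_LEN)
        len = BUF_LEN;
    memcpy(c->buf, src, len);
}
```

On a zeroed record, `unchecked_fill(&c, "ABCDEFGHI", 9)` leaves `c.closed` holding the character code of `'I'`, while `checked_fill` with the same arguments leaves it at 0. In real code the stray byte could just as easily land in an unrelated variable, which is why the crash site says so little about where the bad write happened.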
CarlJung
Lives in gote
Posts: 429
Joined: Wed Apr 21, 2010 1:10 pm
Rank: SDK
GD Posts: 0
KGS: CarlJung
Location: Sweden
Has thanked: 101 times
Been thanked: 73 times

Re: The KGS server keeps crashing

Post by CarlJung »

Helel wrote:
wms wrote:When there's a bug, it's best if the bug shows up when I test so that I can fix it there.

But this bug happens so rarely, and I don't know how to make it happen, so I only get info about it from crashes...and very, very little info even then. All I've been able to pin down is that variables are getting utter nonsense in them. A boolean, for example, will have a number in it instead of a 0 or 1. I suspect that I'm walking past the end of an array in the network code, or something like that, but I can't figure out where that could be happening.


:twisted: :evil: :twisted: :evil: :twisted: :evil: :twisted: :evil: :twisted: :evil: :twisted:
Ever heard of open source code...
:twisted: :evil: :twisted: :evil: :twisted: :evil: :twisted: :evil: :twisted: :evil: :twisted:

Have fun debugging! :D


That wouldn't really change the nature of the bug, and wms would still be the only one with access to the server where the problem occurs. More eyes on the problem, yes, but that's it.
CarlJung
Lives in gote
Posts: 429
Joined: Wed Apr 21, 2010 1:10 pm
Rank: SDK
GD Posts: 0
KGS: CarlJung
Location: Sweden
Has thanked: 101 times
Been thanked: 73 times

Re: The KGS server keeps crashing

Post by CarlJung »

Helel wrote:Ahh, so the bug is in no way related to anything wms has coded. My bad. :oops:


It's quite possible that it is; it remains to be seen. But even so, open-sourcing the code wouldn't make it any easier to debug. The error occurs on the server, and you can't give everyone access to it to tinker away to their heart's content.
CarlJung
Lives in gote
Posts: 429
Joined: Wed Apr 21, 2010 1:10 pm
Rank: SDK
GD Posts: 0
KGS: CarlJung
Location: Sweden
Has thanked: 101 times
Been thanked: 73 times

Re: The KGS server keeps crashing

Post by CarlJung »

wms,

I'm sure you have a test server. Has the error ever occurred on that one? Can't we fill it with a few thousand weakbots/randombots that all play blitz in order to simulate some load? I have 10 Mbit upload that mostly sits idle and quite a powerful computer. I'm sure others have similar setups. That could potentially be a way forward.
tj86430
Gosei
Posts: 1348
Joined: Wed Apr 28, 2010 12:42 am
Rank: FGA 7k GoR 1297
GD Posts: 0
Location: Finland
Has thanked: 49 times
Been thanked: 129 times

Re: The KGS server keeps crashing

Post by tj86430 »

CarlJung wrote:
Helel wrote:Ahh, so the bug is in no way related to anything wms has coded. My bad. :oops:


It's quite possible that it is; it remains to be seen. But even so, open-sourcing the code wouldn't make it any easier to debug. The error occurs on the server, and you can't give everyone access to it to tinker away to their heart's content.

From what I remember of my merry days of coding and debugging C/C++ (which is what I suspect the code in question is), this kind of bug often isn't caught by debugging at the moment the actual error occurs. The problematic code may have been executed well in advance. If the culprit is something wms wrote, then it might help to have several people look at it. Of course, if the bug may be virtually anywhere, it probably won't help unless it can be narrowed down. One theoretical possibility is to run everything in a debugger with watches guarding the memory that will eventually be overwritten, but that is (or at least used to be) much too slow. Perhaps debugging tools have improved since I coded for a living.
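
A cheaper relative of the debugger watch described above is to put guard words around a suspect buffer and check them at chosen points in the program. The following is only a sketch under stated assumptions: `struct guarded_buf`, `guarded_init`, `guarded_ok`, and the `GUARD_WORD` pattern are hypothetical names made up for illustration, not anything from the KGS server.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUARD_WORD 0xDEADBEEFu /* arbitrary, easy-to-spot pattern */
#define BUF_LEN    16

/* Hypothetical wrapper: a known pattern sits on both sides of buf. */
struct guarded_buf {
    uint32_t front;
    char     buf[BUF_LEN];
    uint32_t back;
};

/* Sets both guard words and clears the buffer. */
void guarded_init(struct guarded_buf *g)
{
    assert(g != NULL);
    g->front = GUARD_WORD;
    memset(g->buf, 0, sizeof g->buf);
    g->back = GUARD_WORD;
}

/* Returns 1 while both guard words are intact, 0 once either one
 * has been overwritten. */
int guarded_ok(const struct guarded_buf *g)
{
    return g->front == GUARD_WORD && g->back == GUARD_WORD;
}
```

Calling `guarded_ok` at a few checkpoints, for example after each read into the buffer, can help narrow down which step trampled the guard. It only detects writes that actually hit a guard word, so it is a way to shrink the search area, not a guarantee of finding the bug.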
Offending ad removed