It is currently Tue Apr 30, 2024 4:03 am

All times are UTC - 8 hours [ DST ]




Post new topic Reply to topic  [ 28 posts ]  Go to page 1, 2  Next
Author Message
 Post subject: Mass downloader??? Help!
Post #1 Posted: Mon Oct 05, 2015 8:32 pm 
Beginner

Posts: 13
Liked others: 0
Was liked: 0
Rank: KGS7d
In Fuseki Info for Tygem, there are over 200,000 games in the database.

What I want to know is: how were all these games collected?

I'm looking for a way to mass-download all the recent data.

I asked baduk.org via e-mail but received no answer.

I also asked Tygem about the database, but they said the only way they know of is to download each game manually... which would take a scary amount of time and energy.



Also, what pattern-search program do you recommend? Kombilo? Drago? SmartGo? I'm willing to pay big bucks for the best program if I need to. :bow:

 Post subject: Re: Mass downloader??? Help!
Post #2 Posted: Mon Oct 05, 2015 8:43 pm 
Honinbo

Posts: 9545
Liked others: 1600
Was liked: 1711
KGS: Kirby
Tygem: 커비라고해
Either curl or wget will work. Look them up.

_________________
be immersed

 Post subject: Re: Mass downloader??? Help!
Post #3 Posted: Mon Oct 05, 2015 9:08 pm 
Beginner

Posts: 13
Liked others: 0
Was liked: 0
Rank: KGS7d
Kirby wrote:
Either curl or wget will work. Look them up.


WOW, you're my savior, man. I haven't looked them up yet, but I'm sure they will work like a charm!

 Post subject: Re: Mass downloader??? Help!
Post #4 Posted: Mon Oct 05, 2015 9:25 pm 
Beginner

Posts: 13
Liked others: 0
Was liked: 0
Rank: KGS7d
Kirby wrote:
Either curl or wget will work. Look them up.


Well... I'm such a terrible tech guy that this seems scary.

Have you used them before to download Tygem game data? If so, any tips on that matter would be greatly appreciated :salute:

 Post subject: Re: Mass downloader??? Help!
Post #5 Posted: Mon Oct 05, 2015 9:31 pm 
Honinbo

Posts: 9545
Liked others: 1600
Was liked: 1711
KGS: Kirby
Tygem: 커비라고해
MP4Life wrote:
Kirby wrote:
Either curl or wget will work. Look them up.


Well... I'm such a terrible tech guy that this seems scary.

Have you used them before to download Tygem game data? If so, any tips on that matter would be greatly appreciated :salute:


These programs can be used to download a URL, so just make a script to download the page, parse the links, and download each link.

If the site has a robots.txt file, you might want to follow it.
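
For anyone following along, here is a rough Python sketch of those three steps (download the page, parse the links, download each link), with a robots.txt check added. It uses only the standard library rather than curl or wget. The user-agent name, the `.sgf` suffix, and the five-second pause are all assumptions for illustration; nothing here is specific to Tygem or any real site, and the owner's permission still comes first.

```python
import time
import urllib.request
import urllib.robotparser
from html.parser import HTMLParser
from urllib.parse import urljoin

USER_AGENT = "example-game-fetcher"  # hypothetical name, pick your own


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url, suffix=".sgf"):
    """Return the absolute links on the page that end with `suffix`."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return [link for link in parser.links if link.endswith(suffix)]


def fetch(url):
    """Download one URL and return its raw bytes."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def download_all(index_url, delay_seconds=5.0):
    """Fetch the index page, then each allowed link, pausing between requests."""
    robots = urllib.robotparser.RobotFileParser()
    robots.set_url(urljoin(index_url, "/robots.txt"))
    robots.read()
    if not robots.can_fetch(USER_AGENT, index_url):
        return {}
    page = fetch(index_url).decode("utf-8", errors="replace")
    results = {}
    for link in extract_links(page, index_url):
        if not robots.can_fetch(USER_AGENT, link):
            continue  # the site asked robots to stay away from this path
        time.sleep(delay_seconds)  # one request at a time, with a pause
        results[link] = fetch(link)
    return results
```

Nothing runs until you call `download_all(...)` with a real index URL, so you can try `extract_links` on a saved copy of a page first without touching anyone's server.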

_________________
be immersed

 Post subject: Re: Mass downloader??? Help!
Post #6 Posted: Mon Oct 05, 2015 11:25 pm 
Lives in sente

Posts: 923
Location: UK
Liked others: 72
Was liked: 479
Rank: 5 dan
KGS: macelee
I hate to see people attempting a mass download without consulting the owner of the service. At the moment I even hate to see people discussing it on this forum.

When you do this, things can go wrong. And often things will go wrong.

Last night at approximately 22:48 UK time, go4go.net was apparently brought down by a badly written script. My early analysis shows that the script sent 150 or so requests to my server within minutes to grab some data intensive pages, causing the server to run out of memory. The traffic was from a host in Amazon EC2's network 52.89.xxx.xxx (damn, I have to protect the privacy of whoever did this).


This post by macelee was liked by 2 people: Bantari, LocoRon
 
 Post subject: Re: Mass downloader??? Help!
Post #7 Posted: Tue Oct 06, 2015 12:01 am 
Honinbo

Posts: 9545
Liked others: 1600
Was liked: 1711
KGS: Kirby
Tygem: 커비라고해
macelee wrote:
I hate to see people attempting a mass download without consulting the owner of the service. At the moment I even hate to see people discussing it on this forum.


1. I didn't say he should make a script without consulting the owner of the service, and 'MP4Life' didn't say that he was doing this either.
2. IMO, contacting the owner of the service is a courtesy more than an obligation.
3. I also suggested following 'robots.txt', even though that's not an obligation.

Since we're all friends here in the go community, yeah, sure. Try to be courteous.

But "I hate to see people" trying to suggest that I've done something morally wrong here by bringing up to 'MP4Life' publicly available information.

_________________
be immersed

 Post subject: Re: Mass downloader??? Help!
Post #8 Posted: Tue Oct 06, 2015 12:04 am 
Lives in sente

Posts: 923
Location: UK
Liked others: 72
Was liked: 479
Rank: 5 dan
KGS: macelee
Kirby, thanks for your reply; I withdraw my comments. Please understand my frustration at having to waste an hour on such a matter first thing in the morning.

 Post subject: Re: Mass downloader??? Help!
Post #9 Posted: Tue Oct 06, 2015 1:34 am 
Gosei

Posts: 2011
Location: Groningen, NL
Liked others: 202
Was liked: 1087
Rank: Dutch 4D
GD Posts: 645
Universal go server handle: herminator
Kirby wrote:
But "I hate to see people" trying to suggest that I've done something morally wrong here by bringing up to 'MP4Life' publicly available information.


Just because information is publicly available does not automatically make it morally right to point people to it. If someone goes online and asks for information on how to make their own fireworks, I'd expect people to at least include a warning on the dangers of doing so when they're providing information. IMO, it is pretty obvious that MP4Life has very little expertise on the matter, and if he does manage to cobble together a mass downloader he is quite likely to severely impact someone's server and make someone's morning miserable. I think it is appropriate to warn him of the risks and suggest alternatives, rather than just throwing some information his way and washing your hands of it.


This post by HermanHiddema was liked by: Bantari
 
 Post subject: Re: Mass downloader??? Help!
Post #10 Posted: Tue Oct 06, 2015 2:08 am 
Lives in gote

Posts: 308
Liked others: 54
Was liked: 71
Rank: EGF 5k Foxy 2k
Suggesting writing a curl script to a "terrible tech guy" is a bit like asking them to lick their own elbows (maybe it's not impossible, but it's entertaining to watch). How about a code review session? :cool:

Curl is a great tool though. I use it to download .asx files from Wbaduk so I can see the hidden url for lecture videos which I capture with VLC for offline viewing.

Quote:
I think it is appropriate to warn him of the risks and suggest alternatives


Like, Googling what a web scraper is?

_________________
12k: 2015.08.11; 11k: 2015.09.13; 10k: 2015.09.27; 9k: 2015.10.10; 8k: 2015.11.08; 7k: 2016.07.10 6k: 2016.07.24 5k: 2018.05.14 4k: 2018.09.03 3k: who knows?

 Post subject: Re: Mass downloader??? Help!
Post #11 Posted: Tue Oct 06, 2015 3:34 am 
Judan

Posts: 6725
Location: Cambridge, UK
Liked others: 436
Was liked: 3719
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
MP4Life wrote:
Also, what pattern-search program do you recommend? Kombilo? Drago? SmartGo? I'm willing to pay big bucks for the best program if I need to. :bow:


fuseki.info seems associated with BiGo (http://bigo.baduk.org/index.html), so why not buy that to get the huge database and pattern searching software*? Or do you want the very latest Tygem games from yesterday they don't have? Or actively want the intellectual/programming challenge of writing a web scraper?

*An answer might be because they pinched it from GoGoD or some other source (not an accusation with any evidence, just hypothesising) and you might not wish to give your money to people you consider ethically dubious (e.g. the MoyoGo saga).

 Post subject: Re: Mass downloader??? Help!
Post #12 Posted: Tue Oct 06, 2015 6:35 am 
Lives in sente

Posts: 1037
Liked others: 0
Was liked: 181
Look, I've said my piece on related topics. It doesn't even have to be an ERROR in a script, just an error in conception. A script might work OK in testing (making 100 access requests against that database) but bring the service to a halt when dumping in 100,000 (the server might be able to handle 40,000 per day if each takes a couple of seconds).

a) You should be experienced before tackling something like this, ideally real world experience.

b) It is not just courtesy. Those whose database it is have a perfect right to consider something bringing down their database an attack. They may have a way for you to do this safely, and may be willing to let you do that (go against their backup loaded copy, or download their flat backup copy and you load on your own hardware).

In the real world I came from, we had full size "test" versions of our production databases, and it's the test version that programmers used when developing software, so if something went wrong, it didn't affect production.

Understand? When they make a database available for public access, they mean access by people sitting at a terminal entering keystrokes on a keyboard. They can estimate how many people there will be and how fast a person can type, and size the system to be adequate for that volume of activity. But a program could be firing off requests MUCH faster than that.
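
To put rough numbers on the per-day capacity estimate in this post, here is a back-of-the-envelope calculation in Python. It assumes, purely for illustration, that the server handles requests strictly one at a time and that "a couple of seconds" means exactly 2 seconds; real servers will differ.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def requests_per_day(seconds_per_request):
    """How many back-to-back requests fit into one day."""
    return SECONDS_PER_DAY / seconds_per_request


def days_to_serve(total_requests, seconds_per_request):
    """How long the server stays busy serving a batch, in days."""
    return total_requests * seconds_per_request / SECONDS_PER_DAY


print(requests_per_day(2))                  # 43200.0
print(round(days_to_serve(100_000, 2), 2))  # 2.31
```

So under these assumptions a 100,000-request dump would occupy the whole server for more than two days, which is in line with the "about 40,000 per day" ballpark.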


This post by Mike Novack was liked by 2 people: Bantari, LocoRon
 
 Post subject: Re: Mass downloader??? Help!
Post #13 Posted: Tue Oct 06, 2015 6:55 am 
Honinbo

Posts: 9545
Liked others: 1600
Was liked: 1711
KGS: Kirby
Tygem: 커비라고해
HermanHiddema wrote:
I think it is appropriate to warn him of the risks and suggest alternatives, rather than just throwing some information his way and washing your hands of it.


Then feel free to do so. I don't personally feel any such moral obligation.

_________________
be immersed

 Post subject: Re: Mass downloader??? Help!
Post #14 Posted: Tue Oct 06, 2015 6:59 am 
Honinbo

Posts: 9545
Liked others: 1600
Was liked: 1711
KGS: Kirby
Tygem: 커비라고해
Mike Novack wrote:

b) It is not just courtesy. Those whose database it is have a perfect right to consider something bringing down their database an attack.


Then the db owners should take those precautions. Nobody has "attacked" anyone here, and knowing about tools like wget and curl is quite useful.

_________________
be immersed


This post by Kirby was liked by: Jujube
 
 Post subject: Re: Mass downloader??? Help!
Post #15 Posted: Tue Oct 06, 2015 7:26 am 
Gosei

Posts: 2011
Location: Groningen, NL
Liked others: 202
Was liked: 1087
Rank: Dutch 4D
GD Posts: 645
Universal go server handle: herminator
Kirby wrote:
...we're all friends here in the go community...

Kirby wrote:
I don't personally feel any such moral obligation.


Moral obligation is a strong term, but IMO some effort to prevent an inexperienced developer from accidentally causing grief to others in the go community would certainly be friendly. As macelee's example shows, mass downloading causes real frustration and real work for fellow players.


This post by HermanHiddema was liked by: Bantari
 
 Post subject: Re: Mass downloader??? Help!
Post #16 Posted: Tue Oct 06, 2015 9:03 am 
Honinbo

Posts: 9545
Liked others: 1600
Was liked: 1711
KGS: Kirby
Tygem: 커비라고해
HermanHiddema wrote:
Kirby wrote:
...we're all friends here in the go community...

Kirby wrote:
I don't personally feel any such moral obligation.


Moral obligation is a strong term, but IMO some effort to prevent an inexperienced developer from accidentally causing grief to others in the go community would certainly be friendly. As macelee's example shows, mass downloading causes real frustration and real work for fellow players.


I agree with that. Most of my pushback on this is because I thought that it was strong of macelee to say that he hated this discussion (which he kindly withdrew).

My comments were intended to be informative to the OP, and I'm not suggesting that he launch some sort of DoS attack on anybody.

So yes, I agree that it's friendly to work with website owners - it makes their job easier. But this shouldn't preclude people from discussing how to use common web technologies.

(As a side note, if mass downloading is something a website owner is concerned about, there are options available. CAPTCHA is one example. If you're going to make a website that scales, it's better to use technical means to protect yourself rather than relying on the goodwill of all users worldwide.)

_________________
be immersed

 Post subject: Re: Mass downloader??? Help!
Post #17 Posted: Tue Oct 06, 2015 11:39 am 
Dies with sente

Posts: 96
Liked others: 0
Was liked: 14
macelee wrote:
I hate to see people attempting a mass download without consulting the owner of the service. At the moment I even hate to see people discussing it on this forum.

When you do this, things can go wrong. And often things will go wrong.

Last night at approximately 22:48 UK time, go4go.net was apparently brought down by a badly written script. My early analysis shows that the script sent 150 or so requests to my server within minutes to grab some data intensive pages, causing the server to run out of memory. The traffic was from a host in Amazon EC2's network 52.89.xxx.xxx (damn, I have to protect the privacy of whoever did this).


It's not that hard to rate limit packets coming in at that rate from the same IP. You could use something like fail2ban if you don't want to roll your own.
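
If you do want to roll your own, the core idea fits in a few lines. Below is a minimal Python sketch of a per-IP sliding-window limiter. It is not how fail2ban works internally and it is not wired into any web server; the class name and the limits used in the example are made up for illustration.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> timestamps of recent requests

    def allow(self, client_ip, now):
        """Return True if this request may proceed; `now` is a time in seconds."""
        hits = self._hits[client_ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject, delay, or ban
        hits.append(now)
        return True


# Example: at most 3 requests per minute from any one address.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
for second in range(5):
    print(second, limiter.allow("203.0.113.7", now=second))
```

In a real deployment `now` would come from something like `time.monotonic()`, and a rejected request would typically get an HTTP 429 response or a temporary firewall ban.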

 Post subject: Re: Mass downloader??? Help!
Post #18 Posted: Tue Oct 06, 2015 1:58 pm 
Beginner

Posts: 13
Liked others: 0
Was liked: 0
Rank: KGS7d
Wow it seems like a hurricane passed by here..

I thank Kirby for introducing me to some awesome programs, and I also want to thank the others for warning me of possible dangers; obviously I don't wanna cause anyone any trouble.

But yeah, after looking into these programs it soon became evident that I will need some help.

It'd be great to learn everything on my own, but it's probably a bit too much.

So what's next? Should I look for someone to show me what to do? I'm willing to pay a couple hundred bucks, but I have no idea how time-consuming it is for an expert... I'm kinda lost on where to begin looking for this "expert" as well.

There's a university near my home, so maybe I can find someone to help me there?

 Post subject: Re: Mass downloader??? Help!
Post #19 Posted: Tue Oct 06, 2015 3:02 pm 
Lives in gote

Posts: 653
Location: Austin, Texas, USA
Liked others: 54
Was liked: 216
Have you considered Go4Go or GoGoD? They sell their database for a pretty low price. Would save you the hassle of trying to scrape some other website.

http://gogodonline.co.uk/ 84383 games
http://www.go4go.net/go/delivery_service_faq 50315 games

 Post subject: Re: Mass downloader??? Help!
Post #20 Posted: Wed Oct 07, 2015 3:00 pm 
Lives in gote

Posts: 677
Liked others: 6
Was liked: 31
KGS: 2d
I'd definitely support the idea to harvest 500,000 games or so from Tygem 9/8d's with a script that does not misuse or damage the server. Right now fuseki.info and http://ps.waltheri.net/ are the best available sites.
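
One ingredient of a script that does not damage the server is backing off when the server signals that it is struggling. Here is a small Python sketch of that idea. Treating HTTP 429 and 503 as "slow down" signals, the delay values, and the function names are all assumptions for illustration, and none of this replaces asking the site owner first.

```python
import time
import urllib.error


def backoff_delays(base_seconds=5.0, factor=2.0, max_seconds=300.0, attempts=5):
    """Waits that grow by `factor` each attempt, capped at `max_seconds`."""
    delays = []
    delay = base_seconds
    for _ in range(attempts):
        delays.append(min(delay, max_seconds))
        delay *= factor
    return delays


def fetch_with_backoff(fetch, url, delays, sleep=time.sleep):
    """Call fetch(url); on HTTP 429/503, wait and retry, then give up."""
    for delay in delays:
        try:
            return fetch(url)
        except urllib.error.HTTPError as error:
            if error.code not in (429, 503):
                raise  # any other error is not a "slow down" signal
            sleep(delay)
    raise RuntimeError(f"giving up on {url} after {len(delays)} attempts")


print(backoff_delays(5, 2, 60, 5))  # [5, 10, 20, 40, 60]
```

Here `fetch` is whatever download function the script already uses; passing it in as a parameter keeps the retry logic separate and easy to test without touching a real server.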
