by
Dean Lunz
14. April 2010 15:24
- Uploaded this week's armory data
- Fixed an issue with the date selector not working properly (dates are still not carried across page views; I'm working on fixing that)
- Added a members link on the guild stats page to view the list of guild members
I have a desktop app I made to crawl the armory from a master list of toon names I have been building. Last night I crawled a few thousand characters and everything went fine, but about 5% came back as HTTP 503 ("Service Unavailable") errors, i.e. the server refused my request. This is normal and expected based on past experience. So this morning I used that same program to try to re-download all the toons that came back as 503 errors. But this time only a dozen or so characters were crawled before I started getting 503 errors for every single request after that.
I can't even use Firefox to look up characters anymore, and I was locked out of the armory for a few hours.
So obviously Blizzard has flow control built into the armory to limit the volume of requests that come in. Each character requires about 10-12 requests (1 character page, 9 or so statistics pages, etc.). Multiply that by thousands of character names and a 50 ms delay between each request, and it takes a very, very long time.
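To put rough numbers on that, here is a minimal back-of-the-envelope sketch in Python. The function name, the 5,000-character list size and the 1-second average response time are assumptions picked purely for illustration; only the 10-12 requests per character and the 50 ms delay come from my crawler.

```python
def estimate_crawl_seconds(characters, requests_per_character,
                           delay_seconds, avg_response_seconds=0.0):
    """Rough wall-clock estimate for a strictly sequential crawl.

    Every request costs the fixed delay plus however long the server
    takes to answer (avg_response_seconds is a guess, not a measurement).
    """
    total_requests = characters * requests_per_character
    return total_requests * (delay_seconds + avg_response_seconds)


if __name__ == "__main__":
    # Hypothetical list of 5,000 toons at 11 requests each.
    delay_only = estimate_crawl_seconds(5000, 11, 0.05)
    with_latency = estimate_crawl_seconds(5000, 11, 0.05, avg_response_seconds=1.0)
    print(f"delay only:        {delay_only / 3600:.1f} hours")
    print(f"with ~1s response: {with_latency / 3600:.1f} hours")
```

The point of the sketch is that the fixed delay is only part of the cost. The time the armory takes to answer each request counts just as much, and both get multiplied by every page of every character.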
This is not to say that there are no ways around this. I am working on a system to distribute the crawling of the armory across any number of machines, but that is still a long way off.
I just find it utterly confusing that I can make thousands of requests overnight, but come morning, or what I assume are peak armory hours, I can hardly make a few dozen requests before getting denied with 503s.
Calculating and presenting character/guild/realm statistics is easy mode. Trying to get the character data from the armory is a real pain in the ass. :(
After some searching I came across this article about how the armory throttles requests. I'm going to have to look further into how to overcome that, because a crawl already takes me 20+ hours and it would take 3 days if I delayed each request by 1.5 seconds. I've been tracking which characters return 503 errors, but if I'm not careful the armory will block me for a long time for making too many requests. I have now implemented an auto-throttling system in the crawler app, so hopefully it will automatically adjust the number of requests per second that the app is making.
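For anyone curious what auto throttling can look like, here is a minimal Python sketch of the general idea, not the actual code from my crawler app. It assumes a simple policy: double the delay every time a 503 comes back, and ease it back down slowly while requests succeed. The names `AdaptiveThrottle`, `crawl` and `fetch`, and all of the default numbers, are hypothetical.

```python
import time


class AdaptiveThrottle:
    """Grows the delay between requests on 503s, shrinks it on success."""

    def __init__(self, min_delay=0.05, max_delay=60.0, backoff=2.0, recovery=0.9):
        self.min_delay = min_delay    # fastest pace we will ever crawl at
        self.max_delay = max_delay    # cap so one bad streak cannot stall forever
        self.backoff = backoff        # multiply the delay by this on a 503
        self.recovery = recovery      # multiply the delay by this on success
        self.delay = min_delay

    def record(self, status_code):
        """Update the delay from the last HTTP status and return the new delay."""
        if status_code == 503:
            self.delay = min(self.delay * self.backoff, self.max_delay)
        else:
            self.delay = max(self.delay * self.recovery, self.min_delay)
        return self.delay


def crawl(names, fetch, throttle, sleep=time.sleep):
    """Fetch each name once and return the names that came back as 503.

    `fetch` is a hypothetical callable that takes a character name and
    returns the HTTP status code of the request it made.
    """
    retry_later = []
    for name in names:
        status = fetch(name)
        if status == 503:
            retry_later.append(name)  # remember it for a later pass
        sleep(throttle.record(status))
    return retry_later
```

The list returned by `crawl` can be fed back in as `names` on a later pass, which matches how I already track the characters that return 503 errors. Keeping `sleep` as a parameter makes it easy to try the logic out without actually waiting.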