originally posted in:BungieNetPlatform
[url=http://jsbin.com/etezel/1]Interactive version[/url].
Please note: this only measures how many topics were created per day in the public forums (including public group topics) from June 14th, 2012 to June 13th, 2013, by UTC date. I trimmed June 14th, 2013 off because the data does not include every topic created on that "day". I forgot to do the same for June 14th, 2012, however.
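For anyone curious what "topics per day according to UTC" looks like in code, here is a minimal Python sketch of the bucketing and trimming described above. It is not the script I actually ran; the function name and its parameters are made up for illustration, and it assumes each topic's creation time is already a timezone-aware datetime.

```python
from collections import Counter
from datetime import datetime, timezone, date

def topics_per_utc_day(creation_times, first_day, last_day):
    """Count topics per UTC calendar day, keeping only first_day..last_day.

    creation_times: iterable of timezone-aware datetimes (assumption).
    first_day, last_day: inclusive bounds as datetime.date values.
    """
    counts = Counter()
    for created in creation_times:
        # Convert to UTC first so a late-evening PDT topic lands on the next UTC day.
        day = created.astimezone(timezone.utc).date()
        # Drop days outside the range (this is how a partial last day gets trimmed).
        if first_day <= day <= last_day:
            counts[day] += 1
    return counts
```

Passing date(2012, 6, 14) and date(2013, 6, 13) as the bounds reproduces the range in the chart; moving the first bound forward a day would also trim the first day I forgot about.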
I would be hesitant to use this as a measure of activity because, as I mentioned in another topic, that's kind of difficult to define (not to mention traffic stats are impossible to obtain beyond what sites like Alexa provide, if you take that point of view).
More pretty graphs are coming soon, and I will post them in this group (as there may be a lot of them). Suggestions for what you'd be interested in seeing are very much welcome, but I can't promise anything. I was thinking of overlaying multiple #tag frequencies on the same graph, so you could compare things like #OffTopic vs #Flood (and thus see which one you should probably use), #PS4 vs #XboxOne, #Destiny, and so on. Again, suggestions welcome.
Those of you interested in the nitty-gritty stuff of how I got this might like to read the following:[spoiler]
I used cURL with PHP to grab the data from the BungieNetPlatform (GetTopicsPaged) without a tag string, iterating from page 0 until it reported that there were no more pages. You can check out my log file [url=http://pastebin.com/d9dCV08F]here[/url] for the requests. To keep network IO low, I requested gzipped responses and decompressed them after retrieval. All up it was ~70MB of HTTP (including headers), which decompressed to ~311MB (without headers), which is great.
Also interesting to look at were the response times. The first 500 or so came back in roughly under a second, but times then increased and hovered at around 2-3 seconds, peaked at 3-4 seconds later on, and eventually dropped fairly quickly back to 1-2 seconds toward the tail end.
I started the crawl at around 2AM PDT, so it makes me wonder whether I was tripping some rate limit for the connection/address, or whether bandwidth is purposely reduced during those hours of the night (which might help explain the decrease in response time as the morning progressed). Then again, the increased times might be the result of a stale cache update to a slave: the initial responses were fast because that data was already present from other users' requests, as opposed to the data requested later on (who's going to look at page 4000?).
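The paging loop itself is simple. Below is a rough Python sketch of the idea (the real crawl was PHP with cURL, so treat this as an illustration only). Both `fetch_page` and the `hasMore` field are hypothetical: `fetch_page` stands in for whatever performs the HTTP request and returns the gzip-compressed body, and `hasMore` is a placeholder for however the response reports that more pages exist.

```python
import gzip
import json
import time

def crawl_pages(fetch_page, start_page=0):
    """Yield (page_number, parsed_body, seconds_taken) until no pages remain.

    fetch_page: hypothetical callable taking a page number and returning the
    gzip-compressed JSON response body as bytes.
    """
    page = start_page
    while True:
        started = time.monotonic()
        compressed = fetch_page(page)
        elapsed = time.monotonic() - started  # response time for this page only
        # Decompress after retrieval, so only the gzipped bytes cross the network.
        body = json.loads(gzip.decompress(compressed))
        yield page, body, elapsed
        # "hasMore" is an assumed field name, not the platform's documented one.
        if not body.get("hasMore"):
            break
        page += 1
```

Logging the elapsed value for each page is enough to produce the kind of response-time picture described above.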
[/spoiler]
-
[i]We shall rise[/i] [spoiler]Thanks for this chart, I love statistics for things I'm a part of.[/spoiler]