Frequently Asked Questions
Where do you get your data?
For many analyses, I use the official Blizzard API. The major challenge with the API is that it offers no native way to draw a random sample of characters, which is the primary reason most sites report only leaderboard data. To get around this, I track a random sample of characters across EU and US realms. To build the sample, I identified a few thousand seed characters, pulled the guild roster of each one, and added every level 60 character on those rosters to the sample. Leaderboard data, by contrast, is pulled directly from the Blizzard API. This method undersamples players who are not in a guild, but to my knowledge it is the best (and only) way of pseudorandomly sampling the character base.
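The seed-and-roster procedure described above can be sketched roughly as follows. This is a minimal illustration, not the actual pipeline: `fetch_roster` is a hypothetical callable standing in for whatever Blizzard API requests return a seed character's guild roster, and the `region`, `realm`, `name`, and `level` keys are assumed field names.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

MAX_LEVEL = 60  # only level 60 characters enter the sample

Member = Dict[str, object]
CharacterKey = Tuple[str, str, str]  # (region, realm, name)


def build_sample(
    seeds: Iterable[str],
    fetch_roster: Callable[[str], List[Member]],  # hypothetical API wrapper
) -> List[CharacterKey]:
    """Collect every level 60 character found on the seeds' guild rosters."""
    seen: Set[CharacterKey] = set()
    sample: List[CharacterKey] = []
    for seed in seeds:
        for member in fetch_roster(seed):
            if member["level"] != MAX_LEVEL:
                continue
            key = (str(member["region"]), str(member["realm"]), str(member["name"]))
            # Two seeds may share a guild, so deduplicate on the character key.
            if key not in seen:
                seen.add(key)
                sample.append(key)
    return sample


# Usage with made-up rosters in place of real API responses.
fake_rosters = {
    "SeedA": [
        {"region": "eu", "realm": "realm-one", "name": "Alice", "level": 60},
        {"region": "eu", "realm": "realm-one", "name": "Bob", "level": 47},
    ],
    "SeedB": [
        {"region": "eu", "realm": "realm-one", "name": "Alice", "level": 60},
        {"region": "us", "realm": "realm-two", "name": "Cara", "level": 60},
    ],
}
sample = build_sample(["SeedA", "SeedB"], lambda seed: fake_rosters[seed])
```

Passing the roster lookup in as a function keeps the sampling logic separate from the API client, so it can be exercised with fake data as shown.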
Arena and RBG match data comes from user-submitted REFlex data (see REFlexology). REFlex is a commonly used PvP addon that tracks your games. To find the top 2v2/3v3 comps, I look at the opponent data rather than the submitting user's data. This gives a sample of the true comp distribution at a given rating. The rating distribution of these matches is of course not the same as that of the general population, but it doesn't need to be in order to say which comps are most commonly played above 1800 or 2100 rating.
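As an illustration of the opponent-based counting described above, here is a small sketch. The match record layout (`opponent_rating`, `opponent_specs`) is an assumption made for this example and is not the actual REFlex export format; the threshold is treated as "at or above", which is also an assumption.

```python
from collections import Counter
from typing import Dict, List, Tuple

Comp = Tuple[str, ...]


def top_comps(matches: List[Dict], min_rating: int, n: int = 5) -> List[Tuple[Comp, int]]:
    """Count opponent comps among matches at or above min_rating."""
    counts: Counter = Counter()
    for match in matches:
        if match["opponent_rating"] >= min_rating:
            # Sort the specs so the same comp is counted once regardless of order.
            counts[tuple(sorted(match["opponent_specs"]))] += 1
    return counts.most_common(n)


# Made-up records, for illustration only.
matches = [
    {"opponent_rating": 1850, "opponent_specs": ["Holy Paladin", "Arms Warrior"]},
    {"opponent_rating": 1900, "opponent_specs": ["Arms Warrior", "Holy Paladin"]},
    {"opponent_rating": 2150, "opponent_specs": ["Fire Mage", "Subtlety Rogue"]},
    {"opponent_rating": 1500, "opponent_specs": ["Fire Mage", "Subtlety Rogue"]},
]
above_1800 = top_comps(matches, 1800)
above_2100 = top_comps(matches, 2100)
```

Only the opponent side of each record is read, so the submitting user's own comp never enters the counts.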
Is there any way to validate your data?
The ladder data reconciles very well with other estimates, such as the R1 estimates provided by Arenamate. These essentially always fall within the confidence intervals of our R1 estimate, which is strong evidence that the sampled distribution is appropriate. Note that because we are sampling the ladder, the estimate of the 99.9th percentile carries a great deal of uncertainty: very few characters in the sample fall in that percentile. If you're an R1 player, you should keep using special-purpose tools such as Arenamate for this.
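To see why the top of the ladder is so uncertain in a sample, one option is a percentile bootstrap. The sketch below is an assumption about how such an interval could be computed, not necessarily the method used for the estimates on this site; the function names and the nearest-rank percentile definition are chosen for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Nearest-rank percentile of an already sorted sequence, q in [0, 100]."""
    rank = math.ceil(q / 100 * len(sorted_values))
    index = max(0, min(len(sorted_values) - 1, rank - 1))
    return sorted_values[index]


def bootstrap_ci(
    ratings: List[float],
    q: float,
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile-bootstrap confidence interval for the q-th percentile."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample the ratings with replacement and re-estimate the percentile.
        resample = sorted(rng.choices(ratings, k=len(ratings)))
        estimates.append(percentile(resample, q))
    estimates.sort()
    lower = percentile(estimates, 100 * alpha / 2)
    upper = percentile(estimates, 100 * (1 - alpha / 2))
    return lower, upper
```

Calling `bootstrap_ci(ratings, 50)` and `bootstrap_ci(ratings, 99.9)` on the same sample makes the contrast concrete: the 99.9th percentile of a few thousand ratings is determined by only a handful of characters, so its resampled estimates depend heavily on which of them are drawn.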
Is the median rating really 1350-1400?
Yes. There is ample evidence that the rating distribution is very deflated compared to last season, especially in 2v2. 3v3 and RBGs have more inflated numbers.