Wednesday, June 20, 2012

The effectiveness of practice - a small study of StarCraft 2 balance

Edit 2012-06-21: Corrected "games" to "won games". Sorry about that!

Win rates have little to do with "balance"

I had some musings about StarCraft 2 balance a while ago and decided to write a program to check them out.

Basically, most discussions about balance end up mentioning win rates a lot. However, because the StarCraft 2 matchmaking tries to guarantee each player has a win rate around 50% in the middle-long term, win rates are actually quite a meaningless indicator of player strength, and hence balance. Regardless of balance, we're always expecting them to end up around 50%.

Imagine you take a player of grandmaster level, and now handicap him by disallowing the use of his races' macro mechanic. The hit to his strength is of a similar nature as would be when one of the races were weaker than the others. After playing for a while (and initially scoring way below 50%), he will likely drop down to somewhere like diamond league, for example, until his opponents are such that he gets a 50% win rate again. So we now know that this player is actually grandmaster, but his race handicap holds him back from realizing his potential. Yet his win rate is the same as always.

Win rates also suffer from temporal fluctuations if effective strategies for once race are developed and it takes a while to find a counter for the other races. Win rates can suddenly shift quite far away from 50%. The actual ability of the players did not change, nor did the game itself suddenly become imbalanced. This would only be the case if it eventually turns out that there is no counter to this strategy.

There is one particular case in which win rates can tell us something about balance issues: Random players. For Random players who play all races equally, we'd expect them to on average score the same with all races. If we take the data of all random players and a certain race appears to outscore the others, it's a good sign of a balance issue.

Effort needed for reaching a league

Unfortunately, gathering statistics of the race performance of Random players is something that seems hard to spider off of Battle.Net. So while Blizzard can and probably does use this information in their balancing discussions, it is not available to us.

Another way to look at race strength is to consider that if we could pick players of similar "innate" skill, give them a race and let them loose on the SC2 ladder, we could check if they end up in the same leagues.

So how do we find these players of similar "real" skill? We can make an assumption: a player improves the more StarCraft 2 games he plays. This isn't a very far fetched conclusion, as its known that most people improve at skilled tasks though practice. We can then rephrase our balance question: for a given amount of practice, will a player of a certain race end up being higher ranked than players of another race?

The data

I spidered data from about 26676 players from the EU StarCraft 2 servers, or about 10% of the active player population in that region. I rejected all players that have no games in Season 8 yet, and all Random players. They're not pertinent to our base question, and they require extra effort to distinguish from each other. I also threw away all players with less than 5 games, which hence have no league yet.

The table below shows the average amount of won games for players in a certain league, both overall, and broken down per race. After the number of games is the 95% standard error on the mean, which essentially gives some idea how accurate the numbers are. (They're mostly accurate to about 5-10%, except for the grandmasters due to extremely low sample size there). The next number is the amount of won games below which 95% of the players in that league are. This gives some idea of the amount of games where a player is extremely likely to get to the next league soon, as well as the actual variance of the amount of games players in a league have. Note that the variance is actually quite big - there's quite some players who have played twice the average amount of games yet are still stuck in a league.

The average number of won games the players in a league have is not the same as the amount of games you expect to need on average to be promoted into that league. That point is (very approximately) somewhere halfway between the average of the lower and the target league. The total number of games is about twice the number of won games - because the matchmaking should keep you around 50%.

So with this introduction, here is the actual data:

    EU Server, 26676 players sampled, 16243 valid samples
    
    6010 players in Protoss (37%)
    5248 players in Terran  (32%)
    4985 players in Zerg    (31%)
    
    Avg wins =   290 (+-    9), 95% upper ( 1021) for  5754 players in bronze     (35%)
    Avg wins =   533 (+-   17), 95% upper ( 1547) for  3468 players in silver     (21%)
    Avg wins =   708 (+-   22), 95% upper ( 1902) for  2715 players in gold       (17%)
    Avg wins =   939 (+-   33), 95% upper ( 2485) for  2133 players in platinum   (13%)
    Avg wins =  1239 (+-   48), 95% upper ( 3047) for  1386 players in diamond     (9%)
    Avg wins =  1680 (+-   86), 95% upper ( 4101) for   783 players in master      (5%)
    Avg wins =  2880 (+- 1544), 95% upper ( 5968) for     4 players in grandmaster (0%)
    
    Avg wins =   293 (+-   15), 95% upper ( 1022) for  2248 players in bronze_Protoss
    Avg wins =   305 (+-   16), 95% upper ( 1079) for  2163 players in bronze_Terran
    Avg wins =   262 (+-   17), 95% upper (  917) for  1343 players in bronze_Zerg
    
    Avg wins =   511 (+-   26), 95% upper ( 1467) for  1269 players in silver_Protoss
    Avg wins =   553 (+-   33), 95% upper ( 1681) for  1162 players in silver_Terran
    Avg wins =   537 (+-   29), 95% upper ( 1483) for  1037 players in silver_Zerg
    
    Avg wins =   687 (+-   34), 95% upper ( 1767) for   977 players in gold_Protoss
    Avg wins =   750 (+-   48), 95% upper ( 2115) for   777 players in gold_Terran
    Avg wins =   695 (+-   37), 95% upper ( 1848) for   961 players in gold_Zerg
    
    Avg wins =   949 (+-   55), 95% upper ( 2480) for   762 players in platinum_Protoss
    Avg wins =   886 (+-   59), 95% upper ( 2292) for   550 players in platinum_Terran
    Avg wins =   966 (+-   57), 95% upper ( 2608) for   821 players in platinum_Zerg
    
    Avg wins =  1217 (+-   77), 95% upper ( 2902) for   476 players in diamond_Protoss
    Avg wins =  1188 (+-   97), 95% upper ( 3065) for   371 players in diamond_Terran
    Avg wins =  1293 (+-   80), 95% upper ( 3156) for   539 players in diamond_Zerg
    
    Avg wins =  1719 (+-  135), 95% upper ( 3974) for   276 players in master_Protoss
    Avg wins =  1687 (+-  177), 95% upper ( 4349) for   225 players in master_Terran
    Avg wins =  1635 (+-  141), 95% upper ( 4017) for   282 players in master_Zerg
    
    Avg wins =  2867 (+- 3537), 95% upper ( 7870) for     2 players in grandmaster_Protoss
    Avg wins =  2893 (+- 1336), 95% upper ( 4782) for     2 players in grandmaster_Zerg

Conclusions

There is a clear correlation between having played more games, and being in a higher league. This validates our earlier assumption. More practice makes you a better player. (Or if you're pedantic about causation-correlation conclusions, at the very least better players practise more.) I know this sounds extremely obvious but it is nice to validate it. It also indicates smurfing etc isn't prevalent enough to mess with our results.

There are no significant differences between the races per league [1]. This means that you cannot get higher ranked faster by picking a specific race. You still need to put in the same amount of practice. This is the main thing we wanted to investigate and it's good news: StarCraft 2 is *really* well balanced, or at least it has been over the last 8 seasons, which this data aggregates.

If you want to get to masters, expect to put in about 3000 practice games, and maybe as much as 6000. If we assume an average game (including postmortem analysis etc) takes 20 real-life minutes, expect to put in about 1000 hours of practice.

If you played 2000 games and are still in bronze league, you're doing it wrong.

[1] This conclusion assumes that the majority of players play one race most of the time, i.e. that offracing only happens occasionally. Without this assumption we can't put the players in race groups to begin with. Note that if offracing were the norm, and the races not balanced, the offracing players would be able to detect this quickly, which in turn would made it less likely that they'd continue to offrace instead of sticking with the strongest race.