Balance in the System, Part II

Greetings Rokugani!

Weighing balance is no mean feat.

It’s time to return to our discussion of game balance, or what I call “faction balance.” It’s something that we all desire for L5R, and really for any game where players have non-identical pieces or abilities. And it’s a hot topic of discussion anywhere that L5R players are, but most of those conversations start with one player boldly proclaiming that the design team is composed of fools who don’t have the common sense God gave a screwdriver. This is usually followed up by somebody pointing out the extreme challenges inherent to creating nine distinct factions that pursue four different victory conditions, and keeping that all balanced. But I rarely see the people in these conversations attempt to really grapple with the question of how balanced L5R really is beyond the occasional anecdote, or vague “feelings” of balance.

This article is the second in a series that attempts to assess and measure the balance between the factions of L5R in the current environment as well as in previous years. My first article (which you can read here) included a lengthy discussion about what faction balance is, how it can be measured, and how we can infer it from the imperfect data sets that we have – Kotei results. As a brief reminder, I ended up looking at how often a specific clan was able to make it past the cut as normalized by how well represented that clan was in the tournament. In an ideal world, if 20% of attendance at a tournament was made up of one clan, then 20% of the decks that made it past the cut should also come from that clan. This normalization process gets us a ratio of odds. In the dream scenario, each clan’s ratio would be 1. The further from 1 a clan goes (either up or down), the less balanced it is. Numbers below 1 indicate that the clan has a hard time getting past the cut (an “underpowered” clan), and numbers above 1 indicate that the clan is making the cut more often than it should (an “overpowered” clan).
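For readers who like to see the arithmetic, here is a minimal Python sketch of that normalization. The function name `odds_ratio` and its parameter names are my own labels for illustration, not anything from the original spreadsheets; the numbers in the example are the 2011 Crab row from the tables later in this article.

```python
def odds_ratio(clan_attendance, total_attendance, clan_made_cut, total_made_cut):
    """Share of the cut divided by share of the field (1.0 is the ideal)."""
    field_share = clan_attendance / total_attendance  # e.g. 20% of attendance
    cut_share = clan_made_cut / total_made_cut        # e.g. 20% of the cut
    return cut_share / field_share


# 2011 Crab: 406 of 3519 players attended as Crab, 71 of 532 cut slots were Crab.
crab_2011 = odds_ratio(406, 3519, 71, 532)
print(f"Crab 2011: {crab_2011:.2f}")  # prints "Crab 2011: 1.16"
```

A clan that makes up 20% of the field and 20% of the cut comes out at exactly 1 with this function, which is the “dream scenario” described above.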

Consult your doctor about whether Skipping Ahead is right for you!

[***Note – if you want to avoid my somewhat lengthy and pedantic discussion of my methodology and wonky data sets, you should skip down to the Turning Back the Clock section to find out how balanced previous Kotei seasons proved to be. You can also skip to the Wrap-up section for the final summary.***]

So far, that’s all math, but the next and perhaps most crucial part of the process is tricky. The question is how close to 1 does a clan’s ratio of odds need to be in order for us to consider that clan to be “balanced”? Below are the categories that I introduced in my last article which each clan will ultimately fall into. I’ve also included a point scoring mechanism which I will use as a way to measure how balanced an environment as a whole is.

  1. Any clan within 10% of the ideal (0.9-1.1) is extremely well-balanced, and I expect that most of the variation from the ideal is due to chance, rather than balance issues. Ideally, this is where all clans would be. I will award the Design Team 2 points for each clan that makes it into this category.
  2. Any clan within 11 – 20% of the ideal (0.8-0.9 and 1.1-1.2) is acceptably balanced. Chance and card power play nearly equal roles in the variation from the ideal. Due to the difficulties in balancing so many factions, this is the category I would expect most clans in a well-balanced environment to fall into. I will award the Design Team 1 point for each clan that makes it into this category.
  3. Any clan within 21 – 35% of the ideal (0.65-0.8 and 1.2-1.35) is not balanced. Card power plays a significant role in the variation from the ideal experienced by these clans. I don’t want any clans to make it into this category, but I do realize that even with good design, clans will occasionally slip into it. The first clan in this category will net the Design Team 0 points, but for each subsequent clan, I will subtract 1 point.
  4. Any clan that is outside of 35% from the ideal is a problem – it either represents a single clan that wins so often they are format-defining, or a clan that is so badly hampered by their poor card pool that they are effectively unable to compete. As this category represents a major detriment to the game, I will subtract 3 points from the Design Team for each of these clans.
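The four categories and the point scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, ratios are compared after rounding to two decimals, and the treatment of edge values (0.80 counts as acceptably balanced, 0.65 as not balanced) is inferred from how those values are scored in the tables later in this article.

```python
def category(ratio):
    """Classify an odds ratio by its distance from the ideal of 1.00."""
    # Work in whole hundredths to avoid floating-point edge cases (assumption:
    # ratios are reported to two decimals, as in the tables).
    deviation = abs(round(ratio * 100) - 100)
    if deviation <= 10:
        return "extremely well balanced"
    if deviation <= 20:
        return "acceptably balanced"
    if deviation <= 35:
        return "not balanced"
    return "unacceptable"


def design_team_score(ratios):
    """Total points for one season, given each clan's odds ratio."""
    points = 0
    not_balanced_seen = 0
    for ratio in ratios:
        cat = category(ratio)
        if cat == "extremely well balanced":
            points += 2
        elif cat == "acceptably balanced":
            points += 1
        elif cat == "not balanced":
            # The first clan in this category costs nothing; each later one costs 1.
            if not_balanced_seen > 0:
                points -= 1
            not_balanced_seen += 1
        else:
            points -= 3
    return points


# Odds ratios from the 2011 table, Crab through Unicorn.
ratios_2011 = [1.16, 1.07, 0.82, 1.29, 0.99, 0.87, 0.80, 1.14, 0.99]
print(design_team_score(ratios_2011))  # prints 11
```

Feeding in the nine odds ratios from any of the season tables should reproduce that season's point total.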


Using this point scoring system means that the Design Team could theoretically score 18 points (2 for each clan), but that would truly be a miracle, and I certainly don’t expect them to put every clan in the bulls-eye, as it were. A more reasonable goal, I think, is to have all the clans fit into the “acceptably balanced” category. So with that in mind, I will score how well the Design Team has done out of a high score of 9 points. Finally, you may recall that when I looked at the composite 2015 Spring Kotei Season data, I found that 7 of the 9 clans were rated as “acceptably balanced” or better, and that the Design Team earned 10 points in total.

Alright, so now that we all know the methodology, the question is what will I be applying it to today? Well, there are really two things that I want to examine. The first is how balanced the game has been in the past (using Kotei 2011-2014 results). And the second is how balanced the current Twenty Festivals Arc format is. I’m going to hold off on looking at the 20F Arc format for now, mostly because once Kotei season ends, the data set will be complete, and I can compile the largest possible data set. Plus I’d rather not write one article now and a completely redundant article in a few weeks’ time. So if you’re looking for me to talk about Twenty Festivals, I’m afraid you’ll have to wait for Part III.

Well, at least we’ll look at one of these!

That means today will be a trip into the past, seeking to measure how balanced the game was during previous Kotei seasons. Before I begin, I do want to include a disclaimer regarding the data (which you can find here). I will be using the entire set of Kotei reports for a single year, which is good because it gives us a nice big data set to work with, but is also bad because it means that our data set includes multiple different formats. There were times that an expansion was released in the middle of the season, flooding the environment with new cards. Far more frequently there were bannings and errata which dramatically changed the environment by crimping dominant deck types. Each of these events creates a dramatically different card environment.

So why is having multiple formats in a data set a problem? Well, data sets including multiple distinct formats are much more likely to have a balanced result. One format may help make up for the deficiencies of another. As an example, if I lumped all of the 2011-2015 Kotei data together, I could make the case that the game is almost perfectly balanced (except for poor Spider who would still only make the cut 76% as often as they should). The next question, then, is why on earth would I include multiple formats together? Well, it comes down to two factors. First, I don’t know how to precisely define an “environment.” Do I consider the before- and after- errata tournaments as separate environments in all cases? For example, is it really worth considering 20F Arc events before the technical errata to Final Sacrifice as different from the events afterward? Personally, I would say no, but that’s because I know what the card does, and I’ve seen the environment before and after the errata and I think they’re the same. But I don’t have this “insider information” for previous years. The second point is that I simply don’t know which environment an event was played in. I could look it up, but I honestly don’t think it worth the several days I think it would take. If anybody else wants to tackle this problem, though, I’d be happy to offer statistical analysis….
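To make the pooling problem concrete, here is a small Python sketch. Every number in it is invented purely for illustration (it is not Kotei data): a hypothetical clan is badly underpowered in one format and badly overpowered in the next, yet looks perfectly balanced once the two formats are lumped together.

```python
def odds_ratio(clan_attendance, total_attendance, clan_made_cut, total_made_cut):
    """Share of the cut divided by share of the field (1.0 is the ideal)."""
    return (clan_made_cut / total_made_cut) / (clan_attendance / total_attendance)


# Hypothetical numbers: (clan attendance, total attendance, clan cut, total cut).
format_a = (100, 1000, 5, 100)    # clan is 10% of the field, 5% of the cut
format_b = (100, 1000, 15, 100)   # clan is 10% of the field, 15% of the cut

# Pool the two formats by summing each column.
pooled = tuple(a + b for a, b in zip(format_a, format_b))

print(f"Format A: {odds_ratio(*format_a):.2f}")  # prints "Format A: 0.50"
print(f"Format B: {odds_ratio(*format_b):.2f}")  # prints "Format B: 1.50"
print(f"Pooled:   {odds_ratio(*pooled):.2f}")    # prints "Pooled:   1.00"
```

In this made-up case both individual formats would land in the unacceptable category, while the pooled season would score as extremely well balanced, which is exactly the kind of masking described above.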

One more final wrinkle (I promise, this is the last!) is that for years that feature unaligned strongholds (2011, 2012, and 2013), I won’t be considering them.  They were never a faction in the same way as the nine clans, and they dramatically underperformed in every year they existed. They make the Spider look positively overpowered! In addition, their inclusion would make it difficult to compare years with the unaligned faction and without. So they’re being left by the wayside (sorry, ronin!).

 

Turning Back the Clock:

Let’s return to 2011 – the second year of Celestial Edition. This Kotei season started shortly after the release of The Dead of Winter, and is perhaps best remembered by Spider players who possessed the powerful Spider Breeder deck, which continued to do well despite the errata. Let’s take a look at the numbers:

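| 2011 Kotei Season | Attendance | % of Field | Made Cut | % of Cut | Odds Ratio | Points |
|---|---|---|---|---|---|---|
| Crab | 406 | 11.54 | 71 | 13.35 | 1.16 | +1 |
| Crane | 450 | 12.79 | 73 | 13.72 | 1.07 | +2 |
| Dragon | 337 | 9.58 | 42 | 7.89 | 0.82 | +1 |
| Lion | 360 | 10.23 | 70 | 13.16 | 1.29 | +0 |
| Mantis | 329 | 9.35 | 49 | 9.21 | 0.99 | +2 |
| Phoenix | 367 | 10.43 | 48 | 9.02 | 0.87 | +1 |
| Scorpion | 453 | 12.87 | 55 | 10.34 | 0.80 | +1 |
| Spider | 422 | 11.99 | 73 | 13.72 | 1.14 | +1 |
| Unicorn | 334 | 9.49 | 50 | 9.40 | 0.99 | +2 |
| Total: | 3519 | | 532 | | | +11 |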

My first observation when looking at these data is that no clan fits into my unacceptable category – no clan deforms the environment, and no clan was unable to effectively compete. This is a fantastic way to start off! Indeed, only Lion slip into the not balanced category – the other 8 clans are all “acceptably balanced” (Crab, Dragon, Phoenix, Scorpion, and Spider), or “extremely well balanced” (Crane, Mantis, and Unicorn). The 2011 Design Team is awarded 11 total points with none taken away. I award this Design Team an A as these numbers indicate that 2011 was an extremely well balanced environment and a model for future design.


The next stop on our train through time is 2012 – the season that started just after the release of the controversial Emperor Edition. It was a year that saw the Crab continue to kick lots of butt, the flipping of the Phoenix from a marginal power to a significant one, and the Spider getting punished for their previous Breeder deck.

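| 2012 Kotei Season | Attendance | % of Field | Made Cut | % of Cut | Odds Ratio | Points |
|---|---|---|---|---|---|---|
| Crab | 512 | 13.31 | 97 | 16.39 | 1.23 | +0 |
| Crane | 325 | 8.45 | 34 | 5.74 | 0.68 | -1 |
| Dragon | 414 | 10.76 | 67 | 11.32 | 1.05 | +2 |
| Lion | 456 | 11.85 | 85 | 14.36 | 1.21 | -1 |
| Mantis | 414 | 10.76 | 61 | 10.30 | 0.96 | +2 |
| Phoenix | 403 | 10.47 | 76 | 12.84 | 1.23 | -1 |
| Scorpion | 503 | 13.07 | 76 | 12.84 | 0.98 | +2 |
| Spider | 339 | 8.81 | 33 | 5.57 | 0.63 | -3 |
| Unicorn | 377 | 9.80 | 58 | 9.80 | 1.00 | +2 |
| Total: | 3848 | | 592 | | | +2 |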

For the first time, we have a clan slide into the unacceptable category. Spider’s fall from grace was a hard one, and that mistake costs the 2012 Design Team 3 points. Tragically, many other clans (Crab, Crane, Lion, and Phoenix) aren’t really all that much better balanced – especially the Crane who are teetering on the edge of unplayability. This costs the Design Team another 3 points. The last four clans (Dragon, Mantis, Scorpion, and Unicorn) actually make it into the “extremely well balanced” category, which brings the Design Team’s total score up to 2 points.  I give this Design Team a D. If you are willing to overlook Spider, the season isn’t horrible, but it’s not that great either.

 

Fast forwarding a year to 2013 keeps us in Emperor Edition, but adds cards from 5 expansions, the latest one being Torn Asunder. It was a season thoroughly dominated by the Mantis (why does THAT seem familiar?) and Phoenix Clans. Let’s look at the numbers:

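| 2013 Kotei Season | Attendance | % of Field | Made Cut | % of Cut | Odds Ratio | Points |
|---|---|---|---|---|---|---|
| Crab | 326 | 11.37 | 58 | 10.41 | 0.92 | +2 |
| Crane | 244 | 8.51 | 31 | 5.57 | 0.65 | +0 |
| Dragon | 285 | 9.94 | 44 | 7.90 | 0.79 | -1 |
| Lion | 323 | 11.26 | 46 | 8.26 | 0.73 | -1 |
| Mantis | 348 | 12.13 | 93 | 16.70 | 1.38 | -3 |
| Phoenix | 356 | 12.41 | 87 | 15.62 | 1.26 | -1 |
| Scorpion | 317 | 11.05 | 61 | 10.95 | 0.99 | +2 |
| Spider | 221 | 7.71 | 29 | 5.21 | 0.68 | -1 |
| Unicorn | 264 | 9.21 | 50 | 9.52 | 1.03 | +2 |
| Total: | 2868 | | 557 | | | -1 |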

The Mantis domination is complete enough to cause them to slip into the unacceptable category, costing the 2013 Design Team 3 points. While they are the only clan in this category, there are several hovering very close to it, including the Crane and Spider (both of whom suffer for the second season in a row), Dragon and Lion (both down from their high the previous year), and Phoenix (who continue their potency from the year before). 5 clans in the “not balanced” category cause the Design Team to lose another 4 points. The last three clans (Crab, Scorpion, and Unicorn) all make it into the “extremely well-balanced” category, and so gain the Design Team 6 points, leaving them with a total of -1. This downward trend is not reassuring. I’ve got to give the 2013 Design Team an F. They managed to balance the Crab (a feat they failed at during the last two seasons), and they kept the Scorpion and Unicorn at their previous balance, but they made every other clan less balanced. I suspect this is where many L5R players got their belief that expansions only help powerful clans.

 

2014 was a tumultuous time for L5R – the game was just coming out of the reviled Emperor Edition and the company decided both to power down the game and to make significant rules changes in the hopes of attracting new players at the same time. It was a year dominated so heavily by Crane that it required multiple significant errata, which mostly just allowed Unicorn to dominate the latter half of the season. So how do those numbers look?

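| 2014 Kotei Season | Attendance | % of Field | Made Cut | % of Cut | Odds Ratio | Points |
|---|---|---|---|---|---|---|
| Crab | 226 | 9.16 | 25 | 5.84 | 0.64 | -3 |
| Crane | 379 | 15.36 | 104 | 24.30 | 1.58 | -3 |
| Dragon | 275 | 11.15 | 44 | 10.28 | 0.92 | +2 |
| Lion | 287 | 11.63 | 38 | 8.88 | 0.76 | +0 |
| Mantis | 240 | 9.73 | 36 | 8.41 | 0.86 | +1 |
| Phoenix | 231 | 9.36 | 23 | 5.37 | 0.57 | -3 |
| Scorpion | 301 | 12.20 | 60 | 14.02 | 1.15 | +1 |
| Spider | 226 | 9.16 | 22 | 5.14 | 0.56 | -3 |
| Unicorn | 302 | 12.24 | 76 | 17.76 | 1.45 | -3 |
| Total: | 2467 | | 428 | | | -11 |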

Oh dear. This is clearly problematic. Crab, Phoenix, and Spider are all unacceptable – they are so weak that these clans are effectively unable to compete meaningfully. But wait, that’s not all! Two more clans (Crane and Unicorn) are so strong that they are format warping. That’s 5 out of 9 clans that fit into the unacceptable category. That’s -15 points for the 2014 Design Team. Not a good place to start. I’ll make this short – Lion is “not balanced”, Mantis and Scorpion are “acceptably balanced” and Dragon stand alone as the only “extremely well balanced” clan. That’s a total of -11 points for 2014. I don’t think there is a grade that encompasses how poorly balanced this year was.

The Big Wrap-up:

Wow, that was a lot of data. If you are still with me, gentle reader, you get an A for fortitude! (For those who skipped down to here, trust me, the stuff above is good, but dense.) It’s finally time to get the big payoff from looking at all that – let’s compare the various Kotei Seasons against each other. To help in this analysis, I’ll include each year’s point total, letter grade, and the number of clans who made it into each category:

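| Year | 2011 | 2012 | 2013 | 2014 | 2015 |
|---|---|---|---|---|---|
| Points | 11 | 2 | -1 | -11 | 10 |
| Grade | A | D | F | F- | A |
| Well Balanced Clans | 3 | 4 | 3 | 1 | 6 |
| Balanced Clans | 5 | 0 | 0 | 2 | 1 |
| Not Balanced Clans | 1 | 4 | 5 | 1 | 1 |
| Unacceptable Clans | 0 | 1 | 1 | 5 | 1 |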
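As a quick sanity check on this summary, each year's point total can be recomputed from its category counts alone. The sketch below is only an illustration: the function name `points_from_counts` is hypothetical, and the counts and reported totals are copied from the summary table above.

```python
def points_from_counts(well, acceptable, not_balanced, unacceptable):
    """Apply the article's scoring rules to a season's category counts."""
    # +2 per well balanced clan, +1 per acceptably balanced clan,
    # the first "not balanced" clan is free and each further one costs 1,
    # and every unacceptable clan costs 3.
    return 2 * well + acceptable - max(0, not_balanced - 1) - 3 * unacceptable


# (well balanced, balanced, not balanced, unacceptable) per year.
counts = {
    2011: (3, 5, 1, 0),
    2012: (4, 0, 4, 1),
    2013: (3, 0, 5, 1),
    2014: (1, 2, 1, 5),
    2015: (6, 1, 1, 1),
}
reported_points = {2011: 11, 2012: 2, 2013: -1, 2014: -11, 2015: 10}

for year, year_counts in counts.items():
    computed = points_from_counts(*year_counts)
    status = "matches" if computed == reported_points[year] else "differs from"
    print(f"{year}: computed {computed}, {status} the reported total")
```

All five years come out consistent with the reported totals under these rules.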

Those… are some dramatic differences. First, it is clear that just last year, the game was not in a good place design-wise. It was slowly slipping into less and less balanced territory, and Ivory did the game no favors. The Design Team fared poorly at balancing the new powered-down environment they created. That having been said, this year is clearly a triumph in game balance, first because it was able to reverse a very bad trend, and second because it is the most balanced Kotei season since 2011. While there are certainly still problems with the current environment (namely Mantis and Spider), I honestly think the 2015 Design Team should be lauded for this accomplishment. These sorts of actions are exactly what the game needs if it is to survive and thrive, and it shows that AEG can learn from their mistakes, and make changes for the better.

Balance is for clans without pirates, arrr!

Before I sign off, I did want to point out a few other observations that I had about individual clans over the years. First, while many clans have gone through high and low periods, none have gone through droughts and floods like the Crane – who jumped from a 0.65 odds ratio in 2013 to 1.58 in 2014. That’s a difference of 0.93, meaning that the difference would be considered extremely well balanced if it were a clan!

The Mantis and Unicorn are interesting because they appear to only have two modes of action – well balanced, or insanely broken. Unicorn is the lesser offender of this, having had 4 balanced years, and 1 unacceptably good year. Mantis has had 3 balanced years and 2 unacceptably good years. They have fallen into the unacceptable category more often than any clan except Spider (who gets into the category a different way…). Turns out that themes that include free gold and cheap personalities are difficult (but possible!) to balance – who knew?

The Dragon and Spider have both been consistently underpowered. Spider, obviously, needs the most attention. They’ve been unacceptably weak for 2 years, not balanced (on the weak side) for 2 years, and acceptably balanced (on the strong side) for only 1 year. While that is certainly the more dramatic situation, the Dragon find themselves in a similar predicament – they’ve only had an odds ratio higher than 1 for a single year, and even then, they only got to 1.05! Admittedly, they got a lot of wins that year, but their odds ratio is really nothing to write home about (unless one is writing about excellent balance). They usually hover around the low end of the “acceptably balanced” category, and while that is ok, I’m sure Dragon players would like at least one year where they break into the powerful side of balanced.

Paragon of fair play? Who would have guessed?!

Finally, it is ironic to me that the Scorpions are the scions of balance. Their balance was spot on for 3 out of 5 years, and pretty darn close to that in the other two years. They seem to be struggling to make the cut a little in the 20 Festivals environment, but still, they are the closest thing we have to a clan who has remained well balanced for all 5 years that I have data on!

It remains to be seen if this year’s extremely good balance is a fluke, or a sign of better things to come. Stay tuned for my final article in this series where I take a close look at the 20 Festivals Arc environment and see exactly how balanced the game really is! In the meantime, go to Kotei (say Hi to me if you go to Sacramento!), support your local gaming store, and enjoy all of your casual games of L5R!

3 thoughts on “Balance in the System, Part II”

  1. Fascinating article. Did you observe any correlation between low honor clans and poor balance? Not sure based on the naked eye, but I know I see this argument on the forums.

  2. Hi Charles – that’s a great question. I assume that you are interested in seeing if low-honor clans are weaker than high-honor clans (as opposed to just less balanced). I don’t really see evidence of that (with the major exception of Spider). Here are the average odds ratios for clans sorted by honor:

    Lion – 0.99
    Crane – 1.03
    Phoenix – 1.04
    Dragon – 0.91
    Unicorn – 1.11
    Crab – 1.02
    Mantis – 1.14
    Scorpion – 0.95
    Spider – 0.77

    So the data doesn’t suggest that low honor clans are unable to effectively compete. In fact, the clans that have done the best (Unicorn and Mantis) are low to medium honor clans. I hope that helps answer the question!

  3. Thanks for the reply! I was thinking of balance because I have seen the argument “Design doesn’t know how to balance low honor clans. They are either broken or noncompetitive.” Maybe that applies to Spider, but not sure from the data it applies to any other low honor clans. I guess all things being equal you would expect low honor clans to be broken or noncompetitive the same as high honor clans. We’re only talking about five seasons, so it would probably be hard to tell anything from the data, but was just wondering.
