
Where are the multi-armed bandits in games? I’ve asked myself this question repeatedly because the use cases are vast, and other tech companies have already done most of the analytical and product heavy lifting. While bandits exist in certain quadrants, they’ve certainly failed to scale. If we consider, for instance, the proportion of active mobile gamers who are actually exposed to a multi-armed bandit, it’s hard to imagine that share has grown to anything significant over the last five to ten years.
This isn’t super secret sauce either; the Superscale guys were singing the models’ praises years ago.
The idea behind bandits is to identify the high-return slot machine by pulling different machines and then shifting future bets to double down on winners and cut losers. This becomes “experimentation as a service,” since it’s easy to add new “slot machines” to the pool and see whether they emerge as “winners.”
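To make the slot-machine analogy concrete, here is a minimal sketch of a Thompson-sampling bandit in Python. The arm names and the reward (a binary “did the player come back” signal) are illustrative assumptions, not any particular product’s implementation.

```python
import random

class ThompsonBandit:
    """Minimal Thompson-sampling bandit: arms are content variants ("slot machines"),
    a pull is one player exposure, reward is a binary outcome such as next-day return."""

    def __init__(self, arms):
        # Beta(1, 1) prior on each arm's conversion rate
        self.stats = {arm: {"wins": 1, "losses": 1} for arm in arms}

    def choose(self):
        # Sample a plausible conversion rate per arm, then play the best sample
        samples = {
            arm: random.betavariate(s["wins"], s["losses"])
            for arm, s in self.stats.items()
        }
        return max(samples, key=samples.get)

    def update(self, arm, reward):
        # reward: 1 if the player came back (or converted), else 0
        key = "wins" if reward else "losses"
        self.stats[arm][key] += 1

    def add_arm(self, arm):
        # "Experimentation as a service": a new variant simply joins the pool
        self.stats[arm] = {"wins": 1, "losses": 1}
```

Because new arms start at an uninformative prior, they get sampled fairly often until evidence accumulates, which is exactly the “add a slot machine and see if it wins” workflow.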
Modelers can also set the exposure per arm, i.e., how much sample a new slot gets while we find out whether it’s a winner. It’s always interesting to see what value the “open-minded” individuals put down here, since it amounts to an R&D budget.
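One way to express that R&D budget, building on the sketch above, is a soft cap on the share of traffic newly added arms can absorb. The 10% default and the re-draw logic are purely illustrative.

```python
def choose_with_budget(bandit, new_arms, pulls_by_arm, total_pulls,
                       budget=0.10, max_redraws=10):
    # If the bandit picks a new arm that has already consumed its share of
    # exposure, re-draw a few times before accepting it. `budget` is the
    # "R&D budget" knob: the fraction of total traffic new arms may take.
    arm = bandit.choose()
    for _ in range(max_redraws):
        over_budget = (
            total_pulls > 0
            and arm in new_arms
            and pulls_by_arm.get(arm, 0) / total_pulls >= budget
        )
        if not over_budget:
            break
        arm = bandit.choose()
    return arm
```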
Practically, seeds are a great example from match-3: test a slew of variants, and select winners. This is exactly what the prior, public experiment on Bejeweled did (Dynamic Difficulty Adjustment for Maximized Engagement in Digital Games). Simply randomizing the starting board state on a single level drove as much as an 80% difference in attempts per success. Without seed stability, level designers are balancing against potentially billions of combinations; imagine trying to balance a foot race where each participant starts from a random point on the starting line.
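In this setup the unit of randomization is the board seed, and the metric the designer cares about is attempts per success. A hypothetical sketch of that measurement (the tuple layout is an assumption):

```python
from collections import defaultdict

def attempts_per_success(plays):
    # plays: iterable of (seed, attempts, succeeded) tuples, one per player-level run
    attempts = defaultdict(int)
    successes = defaultdict(int)
    for seed, n_attempts, succeeded in plays:
        attempts[seed] += n_attempts
        successes[seed] += int(succeeded)
    # Attempts per success for each seed; the spread between the easiest and
    # hardest seed on the same level is what the Bejeweled study observed to
    # reach roughly 80%.
    return {s: attempts[s] / max(successes[s], 1) for s in attempts}
```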
The REALLY cool thing about what Metica is building is that it allows multiple equilibria rather than forcing single-winner convergence. In the seed example above, multiple seeds could emerge as winners. For example, some players may prefer (as measured by retention) more difficult seeds and others easier ones across the entire player lifecycle. Remember, Apex Legends research presented at GDC highlighted how a game becoming too easy also leads to churn.
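I don’t know Metica’s internals, so purely as a sketch of the idea: the simplest way to get multiple winners is one bandit per player segment, so a “hard” seed can win for one cohort while an “easy” seed wins for another. The segment labels and seed names below are invented, and this reuses the ThompsonBandit sketch from earlier.

```python
# One bandit per segment => multiple equilibria instead of a single global winner.
segments = ["grinders", "casuals"]
bandits = {seg: ThompsonBandit(arms=["seed_17", "seed_42", "seed_99"]) for seg in segments}

def choose_for_player(player_segment):
    # Each segment converges on its own preferred seed
    return bandits[player_segment].choose()

def record_outcome(player_segment, seed, came_back):
    # came_back: binary retention signal for that player after seeing the seed
    bandits[player_segment].update(seed, came_back)
```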
In match-3, churn from too-easy difficulty is harder to observe than churn from too-hard difficulty. Hard-difficulty churn is obvious and literally sticks out on dashboards, while too-easy difficulty is a subtle poison. The best solution is to add new variants where, instead of levels being measured against one another, they are measured as sequences. Think of it as evaluating the entire roller coaster rather than a linear subsection.
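One way to operationalize “measure the roller coaster, not the subsection” is to make each arm an entire difficulty sequence and use a longer-horizon reward, which is where too-easy churn actually shows up. The sequence names, difficulty labels, and day-7 retention reward below are all illustrative assumptions, again reusing the earlier ThompsonBandit sketch.

```python
# Arms are whole difficulty sequences rather than individual levels.
sequences = {
    "ramp":      ["easy", "easy", "medium", "hard", "hard"],
    "sawtooth":  ["easy", "hard", "easy", "hard", "medium"],
    "flat_easy": ["easy", "easy", "easy", "easy", "easy"],
}
sequence_bandit = ThompsonBandit(arms=list(sequences.keys()))

def assign_sequence(new_player_id):
    # Each new player is assigned an entire sequence for their early levels
    return sequence_bandit.choose()

def record_week1(sequence_name, retained_day7):
    # Reward on a lifecycle horizon (e.g., still active at day 7), not per level
    sequence_bandit.update(sequence_name, retained_day7)
```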