Abstract
Randomness rarely plays a large part in decision making. In games like chess, for example, there is an objectively “right” choice, and a random choice tends to put a player in a worse position. Games that hinge on what two players think about each other are different. In this study, we use several models to examine inconsistencies in Cognitive Hierarchy Theory: for one example, we take a rock-paper-scissors machine learning algorithm designed to predict a player’s k-level thinking and modify it to play at different k-levels so that their rates of success can be compared. After running roughly 100,000 rounds for each algorithm, we use an analysis of mixed-strategy algorithms to describe the chance of success mathematically and compare the algorithms to each other. The first part of this experiment affirms the ideas of Cognitive Hierarchy Theory: increasing k-levels appear to increase the chance of winning roughly linearly from 3% to 10%. The second part of the paper finds the boundaries within which randomness can succeed in k-level thinking games. We hypothesize that in discrete adversarial games, random choices can be a better alternative to convoluted thinking. This empirical proof of Cognitive Hierarchy Theory, through the various algorithms displayed, has possible implications for smarter decision-making whenever multiple parties are involved, such as in international relations.
Keywords: Cognitive Hierarchy Theory, Game Theory, Discrete Mathematics, Mathematics
Introduction
The Level-K framework, part of a branch of Game Theory called Cognitive Hierarchy Theory, assigns different levels of “thinking” to the different thresholds of logic that each party uses. These levels, called k-levels, are assigned non-negative integer values (0, 1, 2, 3, and so on) and can hypothetically go on forever, as they are placeholders for ever-deeper levels of thought. A higher k-level means a higher quality of thinking, and each k-level is defined recursively: it represents the rational decision one would make assuming everyone else is thinking at the k-level below. For example, a k-level of 0 represents random decision-making, as nobody can think at a level below complete randomness1. This framework traditionally leads to the conclusion that in a completely fair game (that is, one without any luck built into the rules of the game itself), the party thinking at the highest k-level should be the one that wins. As a result, the Level-K framework commonly shows up wherever the goal is to predict human behavior, primarily in business. Stock traders, for example, not only look at earnings reports but weigh them by how they will affect the market, and entrepreneurs must consider what their competitors may be planning when setting their prices. Simply knowing how this framework operates can change how people perceive others, giving Cognitive Hierarchy Theory potential applications in psychology. As an extension of this benefit, understanding where randomness fits into the level-k framework can clarify which rational decisions remain available when the other party is completely unpredictable.
A Numerical Approach
One example of this k-level phenomenon is the Keynesian beauty contest, in which contestants are asked to guess a fraction of the average of all contestants’ guesses on a scale of one to one hundred, rounded up to the nearest integer. This fraction is ½ in most cases2. We will assume that this is not a repeated game (that is, it occurs only once), but that each party has an unlimited amount of time to make their decision.
For the sake of clarity, we will use tables to demonstrate the levels of thinking, indicating the k-levels on the left side and writing down the rational decision for each k-level on the right. Following the reasoning of each successive k-level, we get
| K-Level | Rational Decision |
| --- | --- |
| 0 | None. The party would choose a random number, either one with some sentimental significance or a completely random one. |
| 1 | The average of all random (level-0) guesses would be 50, so half of that would be 25. |
| 2 | The average of all level-1 guesses would be 25, so half of that would be 13 (rounded). |
| 3 | The average of all level-2 guesses would be 13, so half of that would be 7 (rounded). |
And so on. The rational guess at level k is therefore 50(½)^k, and the sequence of guesses follows the limit

$$\lim_{k \to \infty} 50\left(\tfrac{1}{2}\right)^{k} = 0 \qquad (1)$$

as enough thinking would lead to the conclusion that the “best” guess is zero3.
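As a quick illustration, the halving-and-rounding rule above can be iterated directly. The following minimal Python sketch is our own illustration (the function name level_k_guess is not from the study’s code); it reproduces the table above and shows the guesses approaching zero.

```python
import math

def level_k_guess(k: int, start: float = 50.0, fraction: float = 0.5) -> int:
    """Rational guess at level k, assuming level-0 guesses average to `start`
    and each level targets `fraction` of the previous level's average,
    rounded up as in the contest described above."""
    return math.ceil(start * fraction ** k)

for k in range(6):
    print(k, level_k_guess(k))
# Prints 50, 25, 13, 7, 4, 2 -- the guesses head toward zero, matching equation (1).
```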
A Categorical Approach
This approach only works for numerical values, though. For categorical choices, such as kicking left or right on a penalty kick, we can represent the decisions with a reaction table. Here we must assume a starting choice for K = 0, as there is no numerical average to build from. Creating a reaction table for the example of a goalkeeper and shooter (where the actions follow a left-to-right, top-to-bottom pattern, and the goalkeeper and shooter are each assumed to be reasoning about what the other is thinking) gives:
| Shooter | Goalkeeper |
| --- | --- |
| Shoot left (random) | Dive left (response) |
| Shoot right | Dive right |
| Shoot left | Dive left |
(this also assumes that “left” and “right” are something akin to stage left and stage right since the goalkeeper and shooter face each other)
And so on. This alternation of decisions means that a player only wins when their k-level sits exactly one step above the level their opponent is actually playing at, up to full repeats of the cycle: the pattern of choices repeats every four k-levels, so a player at level K beats an opponent at level K_opp exactly when K − K_opp leaves a remainder of 1 when divided by four. An important conclusion to draw from this is that the scorer does not need a higher k-level than the goalkeeper; the scorer just needs to land their k-level in the right band to win. Following the same alternation of reactions, a single round of rock-paper-scissors can theoretically go on forever before the players decide, as each player keeps weighing reactions to what they think their opponent’s k-level is.
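A minimal Python sketch of this alternation (our own illustration; the helper names side and shooter_scores are hypothetical) makes the cycle explicit: the reaction table’s choices repeat every four k-levels, so who wins depends only on the difference between the two k-levels modulo four.

```python
def side(level: int) -> str:
    """Choice implied by the reaction table above: the pattern
    left, left, right, right repeats every four k-levels."""
    return "left" if level % 4 in (0, 1) else "right"

def shooter_scores(shooter_k: int, keeper_k: int) -> bool:
    """The shooter scores exactly when the goalkeeper dives the other way."""
    return side(shooter_k) != side(keeper_k)

# Shooters occupy the even k-levels and goalkeepers the odd ones, so the
# shooter scores precisely when shooter_k - keeper_k leaves remainder 1 mod 4.
for shooter_k in range(0, 8, 2):
    for keeper_k in range(1, 8, 2):
        assert shooter_scores(shooter_k, keeper_k) == ((shooter_k - keeper_k) % 4 == 1)
```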
The experiments that this paper outlines aim to determine exactly where randomness can fit into decision-making games. While completely random guesses can throw opponents off, the soccer example shows that there are boundaries on when a random k-level will work. Defining these boundaries gives insight into related areas of mathematics, such as probability. While the second part of this paper is largely theoretical, it provides a flowchart of sorts for deciding what to do when the opposing party’s choices are unknown.
We will also use the idea of risk and loss functions to explore what to do when a random choice is not favorable. Once we discover where randomness resides within a specific game, we can then figure out how to choose which potential k-levels to act at.
For the example of rock-paper-scissors, we can define a loss function L(θ, x), where θ is the player’s choice and x is the opponent’s choice. The loss function then becomes

$$L(\theta, x) = \begin{cases} c_1 = 0, & \text{if the player wins or ties} \\ c_2 = C, & \text{if the player loses} \end{cases} \qquad (2)$$

where C is a set cost depending on how much a specific round weighs in the overall win or loss. For this example, scissors = 0, rock = 1, and paper = 2 are the possible integer values for θ and x. To find the calculated risk of a specific choice (for example, choosing rock), we can write the risk function as

$$R(\text{rock}) = c_1\, p(c_1 \mid x) + c_2\, p(c_2 \mid x) \qquad (3)$$

where p(ci | x) is the probability of the corresponding outcome ci from the loss function given the other person’s choice x. Here p(c1 | x) is the probability of winning or tying, as both are “favorable” outcomes (nothing is lost), and p(c2 | x) is the probability of losing (hence the cost of C). The first term vanishes because c1 = 0, so our final equation for selecting rock is

$$R(\text{rock}) = C\, p(c_2 \mid x) \qquad (4)$$
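To ground equations (2)–(4), here is a minimal Python sketch (our own illustration, not the study’s code) that computes the risk of a choice given an assumed probability distribution over the opponent’s play; only the losing outcome contributes, because c1 = 0.

```python
SCISSORS, ROCK, PAPER = 0, 1, 2   # encoding from the text

def loses_to(choice: int, opponent: int) -> bool:
    """With this encoding, each option loses to the one numbered one higher (mod 3):
    scissors loses to rock, rock to paper, paper to scissors."""
    return opponent == (choice + 1) % 3

def risk(choice: int, opponent_probs: dict, cost: float = 1.0) -> float:
    """R(choice) = C * p(c2 | x): the win/tie outcome c1 carries zero cost,
    so only the probability of losing matters (equation (4))."""
    return cost * sum(p for opp, p in opponent_probs.items() if loses_to(choice, opp))

# Against a uniformly random opponent the risk of any choice is C/3,
# the same 1/3 * C figure used later in the discussion.
uniform = {SCISSORS: 1/3, ROCK: 1/3, PAPER: 1/3}
print(risk(ROCK, uniform))   # ~0.333
```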
We will use this concept of loss functions being a function of probability to expand on the findings of our experiment. The first part of this paper looks at the direct results of each k-level algorithm for the rock-paper-scissors setup presented, serving as an empirical proof of the k-level framework. We do not have an exact model for each k-level in rock-paper-scissors, but we have created algorithms that think progressively more deeply, effectively establishing their k-levels by running them against each other many times.
Results
In each of ten trials, every algorithm played 100,000 rounds against the Markov chain algorithm, with the following results (Table 3):
Table 3. Number of rounds (out of 100,000 per trial) that each algorithm won against the Markov chain algorithm.

| Trial | Random vs Markov | Winning Choice vs Markov | Win-Stay, Lose-Switch vs Markov | Simple Algorithm vs Markov | NEAT vs Markov |
| --- | --- | --- | --- | --- | --- |
| 1 | 33324 | 350 | 13 | 537 | 267 |
| 2 | 33327 | 346 | 3 | 523 | 376 |
| 3 | 33309 | 352 | 11 | 507 | 689 |
| 4 | 33398 | 329 | 24 | 519 | 1092 |
| 5 | 33285 | 340 | 9 | 549 | 10257 |
| 6 | 33329 | 396 | 10 | 470 | 15483 |
| 7 | 33256 | 335 | 13 | 498 | 20229 |
| 8 | 33142 | 327 | 12 | 508 | 25333 |
| 9 | 33321 | 353 | 16 | 496 | 33097 |
| 10 | 33460 | 326 | 11 | 522 | 33323 |
Below is also a results matrix covering each pair of algorithms (Table 4). The results matrix simply records whether one algorithm would beat another: in creating it, we rounded each matchup’s result to 0 or 100,000 according to whether one algorithm won or lost the majority of its rounds against the other.
Table 4. Results matrix of each algorithm against the others.

| | Random | Winning Choice | Win-Stay, Lose-Switch | Simple Algorithm | Markov/NEAT |
| --- | --- | --- | --- | --- | --- |
| Random | – | 33333 | | | |
| Winning Choice | 33333 | – | 100000 | 100000 | 100000 |
| Win-Stay, Lose-Switch | 33333 | 0 | – | 100000 | 100000 |
| Simple Algorithm | 33333 | 0 | 0 | – | 100000 |
| Markov/NEAT | 33333 | 0 | 0 | 0 | – |
In Table 3, the bots described in the “Methods” section are pitted against the highest k-level algorithm, the Markov chain algorithm; the entries are the number of rounds each algorithm won against it. Table 4 is a results matrix showing the same comparisons, but in a way that describes, more broadly, whether one algorithm wins more often than not against another.
For context, the trial numbers give an idea of each algorithm’s consistency. As seen above, the simple algorithm and the “winning choice” algorithm vary in the number of wins they record per trial run. As a result, the algorithms were run and their results recorded until the difference in the number of wins between consecutive trials was fewer than five, showing that the Markov chain’s calculated probabilities had adjusted to the algorithms. This took exactly ten trials, which is why the data and graphs end at ten trials. We also chose to display only the “winning choice,” “win-stay, lose-switch,” and simple algorithm’s results, because the high number of wins from the other algorithms would distort how the data was displayed.
Discussion
Qualitative Approach to Randomness
In ranking the algorithms by k-level to evaluate the model’s accuracy, we omit the completely random algorithm (level 0), since evaluating randomness is itself the goal of the study. Instead, the remaining algorithms are ranked broadly by how much they “think”; specific k-levels cannot be assigned to them, because there are infinitely many algorithms that could occupy any given level. For reference, the following order of bot performance is expected given the complexity of the algorithms (this “complexity” refers to the lines of code the algorithm takes up, essentially meaning how much the algorithm “thinks” before making its decision):
- The “winning option” bot
- “Win-stay, lose-switch” strategy
- The simple algorithm that uses “suspicion” values to choose when to become random
- The NEAT/Markov Chain algorithms
The hypothesis set earlier in the paper holds up remarkably well. The results matrix in Table 4 aligns with how one would expect each algorithm to perform against the others given their assigned levels of thinking. Additionally, each successive k-level bot wins progressively more against the Markov chain algorithm, reflecting the k-level framework. As seen in Table 3, the Markov chain, NEAT, and simple algorithms all fall into place, performing relative to each other as expected, but the “winning choice” algorithm and the “win-stay, lose-switch” algorithm both perform opposite to expectations, with the “winning choice” algorithm outperforming the “win-stay, lose-switch” algorithm. This could be due to the nature of the “winning choice” strategy throwing off the Markov chain with its data: although the strategy is simple from a bird’s-eye view, this particular Markov chain evaluates entirely based on the other algorithm’s choices, so it would view the “winning choice” algorithm’s choices as more random than the “win-stay, lose-switch” strategy’s. Additionally, when placed against each other, the “win-stay, lose-switch” bot beat the “winning choice” bot as expected from the order above (see Table 4), so the likelihood that the “winning choice” bot simply happened to throw off the Markov chain is reasonably high. These algorithms were chosen for testing because they reflect common strategies that people use in rock-paper-scissors, and what should beat what can be assessed independently of the results of the algorithms against each other4,5.
The only outliers in these results are the comparisons that involve the random bot, as it essentially reduces each game to an even chance among winning, tying, and losing. It seems that, in rock-paper-scissors specifically, the best k-level to aim for can be one where there are no patterns in one’s choices at all. In other words, when a player has no idea what to do, they can choose to be random instead of trying to guess the other player’s k-level and possibly guessing below it.
Quantitative Approach to Randomness
It is useful to know that, in a rough situation, randomness can essentially level the playing field for a given game. There are limits to this line of thought, though. In situations where players are determined to win (or a one-in-three win rate is not satisfactory), it helps to know exactly what a player risks when choosing a k-level over deciding at random. Most studies have used a mixed model for determining given choices6, but there are still nuances in the differences between k-levels. So far, this article has covered the extremes of the decisions one can make: the completely random choices and the most advanced ones possible. The k-level framework, however, also serves a purpose apart from randomness by analyzing the intermediate levels of thinking to judge which one will win. For this analysis, we can turn to loss functions. The next section is highly theoretical and symbolic; the cost functions below could be substituted with actual probabilities and payouts, but for now we will use symbols.
As we have pointed out, picking an option at random reduces the probability of an unfavorable outcome to ⅓, making the overall risk of a random choice ⅓C. This means that the probability p(c2|x) is really a function of the k-level the player is choosing at: the probability that the player loses is based on the probability that their opponent’s k-level lies below their own. Using the risk equation for choosing rock found earlier yields:
$$R(\text{rock}) = C\, p(c_2 \mid x) = C \cdot \frac{\text{player's k-level}}{\text{number of k-levels below the player's that beat the player's k-level, excluding zero}} \qquad (5)$$
Note that this only works for games where one player goes against another. For games like the Keynesian beauty contest, the k-levels are evenly distributed, and it is impossible to guess what the ultimate average lands on.
The above equation accounts for how large the player’s k-level is and how many k-levels below it beat it, creating a sort of “confidence interval” for whether the player’s k-level is high enough, since in a regular game most players assume that their k-level is the highest of anybody playing6. For example, a higher k-level also has more k-level choices below it that beat that particular decision, lowering the risk; a lower k-level, on the other hand, has fewer k-levels below it that beat it and therefore a higher risk.
There is also a limitation that comes with using cost functions: there has to be something at stake. While the approach above can work for any game, modifications must be made in order to apply these risk equations to specific scenarios, as a game with nothing at stake (C = 0) leaves all of these equations equal to zero.
When Randomness Does Not Work
Firstly, it should be emphasized that a random approach to a game will not work as long as the game has a Nash equilibrium in dominant strategies: that is, an option that each player should choose every time, regardless of what their opponent chooses. For example, in Nicky Case’s The Evolution of Trust, a simulation that repeats the prisoner’s dilemma a certain number of times, the “random” option never prospers across multiple generations7. This is because random choices in such a game only pay off when the player happens to choose the equilibrium option. Otherwise, the random strategy will always fail.
This is a very important distinction to make. With regard to whether randomness can work, games against other people fall into one of two categories: ones where the optimal decision revolves around what the other player is thinking, and ones where it does not. In the aforementioned Keynesian beauty contest, one must submit an answer based on what the other players are expected to answer; the decision is inherently based on what other people are thinking, which puts the Keynesian beauty contest in the first category. In chess, on the other hand, one’s decisions can be made independently of what the other player is actively planning: a perfect chess bot, for example, looks for an “optimal move” that gives it the best positional advantage8. Because an optimal move can be chosen without considering what the other player is thinking, a game like chess fits into the second category. In games of the second category, the random strategy investigated in this paper will never work. “Random” moves will never beat a grandmaster in chess: they might cause a moment of confusion, but the grandmaster will ultimately punish them. On the other hand, saying “I will choose rock” before playing rock-paper-scissors can drastically affect how the other person thinks about the coming round by introducing randomness. The game environment is thus the sole factor in determining when randomness works: if the game’s decisions inherently depend on what the other player is thinking, random moves can work, because k-levels only come into play when the other player’s thoughts enter the picture.
Methods
For the code for both the algorithms and the battling program used for each algorithm, see Appendix A.
The overall design of this study is to set up the algorithms and observe whether or not their results agree with the theoretical predictions of the k-level framework. To collect data and compare different k-level boundaries, we made five different rock-paper-scissors bots, in order of increasing k-level (a brief sketch of two of these strategies follows the list):
- Completely random choice between rock, paper, and scissors,
- Choosing whatever option beats the opponent’s last choice regardless of the result (“winning option” bot),
- The “win-stay, lose-switch” strategy, where the player changes their option to the one that beats their opponent’s last choice if they lose and stays at their previous choice if they win,
- A simple algorithm that, if its streak is too high or its inherent “suspicion” value is too high, changes to a random choice, and
- Two levels of the same complexity, which are a NEAT algorithm (with three input nodes, three output nodes, and no hidden nodes) and a Markov chain-based algorithm.
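The full implementations are in Appendix A; as a reading aid, below is a minimal Python sketch, in our own words, of two of the strategies above. The names win_stay_lose_switch and MarkovBot are illustrative, and details such as tie handling and the exact prediction rule are assumptions rather than a copy of the study’s code.

```python
import random
from collections import defaultdict

OPTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}    # key beats value
COUNTER = {"rock": "paper", "paper": "scissors", "scissors": "rock"}  # value beats key

def win_stay_lose_switch(my_last, opp_last):
    """Stay on the previous choice after a win (ties treated the same way here);
    after a loss, switch to the option that beats the opponent's last choice."""
    if my_last is None:                      # first round: nothing to react to yet
        return random.choice(OPTIONS)
    if BEATS[opp_last] == my_last:           # the opponent beat us: switch
        return COUNTER[opp_last]
    return my_last                           # we won (or tied): stay

class MarkovBot:
    """First-order Markov predictor: count which move the opponent tends to play
    after each of their previous moves, then play the counter to the most likely one."""
    def __init__(self):
        self.transitions = defaultdict(lambda: defaultdict(int))
        self.opp_prev = None

    def play(self):
        if self.opp_prev is None or not self.transitions[self.opp_prev]:
            return random.choice(OPTIONS)        # no data yet: play randomly
        counts = self.transitions[self.opp_prev]
        predicted = max(counts, key=counts.get)  # most likely next opponent move
        return COUNTER[predicted]

    def observe(self, opp_move):
        if self.opp_prev is not None:
            self.transitions[self.opp_prev][opp_move] += 1
        self.opp_prev = opp_move
```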
Our flow of data collection is as follows (a minimal sketch of the battle loop appears after this list):
- Create a program to set up each algorithm against the Markov Chain algorithm.
- Run said program for 100,000 rounds per trial, printing the final count of how many rounds each algorithm won.
- Create respective Excel graphs to see the trends of each algorithm’s win rate and performance relative to each other (Tables 3 and 4).
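For concreteness, the following is a minimal sketch of the kind of battle loop described above, assuming bots that expose play() and observe() methods as in the previous sketch; the real battling program is in Appendix A.

```python
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}  # key beats value

def battle(bot_a, bot_b, rounds=100_000):
    """Play a fixed number of rounds between two bots and tally their wins."""
    wins_a = wins_b = 0
    for _ in range(rounds):
        move_a, move_b = bot_a.play(), bot_b.play()
        if BEATS[move_a] == move_b:
            wins_a += 1
        elif BEATS[move_b] == move_a:
            wins_b += 1
        bot_a.observe(move_b)    # let adaptive bots (e.g., the Markov bot) update
        bot_b.observe(move_a)
    return wins_a, wins_b
```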
As a result, the number of wins that each algorithm records against the theoretically highest k-level algorithm (the Markov chain) serves as a proof of the cognitive hierarchy framework and of whether it applies to the game we chose to experiment with (rock-paper-scissors). Then, with that framework established, we can run the random bot against the other algorithms to test the hypothesis from the introduction that the random algorithm will perform the same against all of them.
Conclusion
As shown through the empirical test of the four algorithms against random choices in rock-paper-scissors, the random algorithm’s performance indicates that, at a certain number of k-levels below what the opponent is choosing, thinking less can lead to better results. This was evident in the analysis of the cost functions (see the introduction and discussion) as well as in the comparison between the random algorithm’s performance and its expected outcomes (for the expected outcomes, see the general ideas presented in the introduction). More generally, this study demonstrates where randomness can work as a strategy. In games where the optimal move does not depend on what the opponent is thinking, randomness will never work, as the optimal move will beat a less predictable one anyway. Only in games that fall into the k-level classification of Cognitive Hierarchy Theory can randomness work, which helps define more strategies in the field of game theory. Of course, Cognitive Hierarchy Theory works in broad, theoretical strokes: a game like rock-paper-scissors cannot have distinct k-levels assigned to specific decisions the way the Keynesian beauty contest can (see the “reaction tables” in the introduction). As a result, further work on how games that use “reaction tables” can have their levels of thinking sorted into k-levels would prove useful for future studies.
Appendix A: Code for Each Rock-Paper-Scissors AI
For the full code, see: https://replit.com/@williamghlaws/code-for-rock-paper-scissors-research#main.py
References
- D. O. Stahl. Evolution of Smartₙ Players. Games and Economic Behavior. 5(4), 604-617 (1993).
- L. Husted. Game theory challenge: Can you predict human behavior? https://www.youtube.com/watch?v=MknV3t5QbUc (2019).
- C. F. Camerer, T.-H. Ho, J.-K. Chong. A Cognitive Hierarchy Model of Games. The Quarterly Journal of Economics (2004).
- L. Smith, A. Wu. How to Win at Rock, Paper, Scissors. https://www.wikihow.com/Win-at-Rock,-Paper,-Scissors#:~:text=Switch%20up%20your%20own%20move,often%2C%20try%20throwing%20rock%20instead. (2024).
- H. Zhang, F. Moisan, C. Gonzalez. Rock-Paper-Scissors Play: Beyond the Win-Stay/Lose-Change Strategy. Games. 12(3), 52 (2021). https://doi.org/10.3390/g12030052.
- D. O. Stahl II, P. W. Wilson. Experimental evidence on players’ models of other players. Journal of Economic Behavior & Organization. 25(3), 309-327 (1994).
- N. Case. The Evolution of Trust. https://ncase.me/trust/ (2017).
- S. H. Fuller, J. G. Gaschnig, J. J. Gillogly. Analysis of the alpha-beta pruning algorithm. Carnegie Mellon University Department of Computer Science. 2-14 (1973).