Research summary: Aligning Superhuman AI with Human Behavior: Chess as a Model System

Summary contributed by Brooke Criswell (@Brooke_Criswell). She’s pursuing a PhD in media psychology and has extensive experience in marketing & communications.

*Reference at the bottom

Artificial Intelligence (AI) is becoming more capable every day, and in some domains it now matches or surpasses human performance. AI systems typically approach problems and decision making differently than people do (McIlroy-Young, Sen, Kleinberg, Anderson, 2020). The researchers in this study created a new model that examines human chess players’ behavior at a move-by-move level, and they developed chess algorithms that match that move-level behavior. This matters because current systems for playing chess online are designed to play the game to win, not to play the way people do.

However, in this research study, the authors found a way to align an AI chess player so that it chooses its next move in a more human-like manner. They found that existing chess engines, when applied to their data, did not predict human moves very well. Therefore, they built a system called “Maia,” a customized version of AlphaZero trained on human chess games, which predicts human moves with a higher accuracy rate than existing chess engines. It also achieves its maximum accuracy when predicting the decisions made by players at specific skill levels. The authors describe this as a dual approach: instead of asking, “What move should be played?” they ask, “What move will a human play?”
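The comparison between engines in the paragraph above rests on a simple measurement: how often a model's chosen move is the same move the human actually played. The following is a minimal sketch of that measurement, not the authors' evaluation code. The position labels, the moves, and the two predictors are made-up illustrations, and the name `move_matching_accuracy` is hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# A position is an opaque label here (a real system might use a FEN string);
# a move is a string in UCI notation such as "e2e4". Both are illustrative.
Position = str
Move = str


def move_matching_accuracy(
    predict: Callable[[Position], Optional[Move]],
    games: Sequence[Tuple[Position, Move]],
) -> float:
    """Fraction of positions where the predicted move equals the human's move."""
    if not games:
        raise ValueError("need at least one (position, move) pair")
    hits = sum(1 for position, human_move in games if predict(position) == human_move)
    return hits / len(games)


# Toy data (made up): three positions and the move a human played in each.
human_moves = [
    ("start", "e2e4"),
    ("after_e4_e5", "g1f3"),
    ("after_nf3_nc6", "f1c4"),
]

# Two hypothetical predictors: one tuned to win, one tuned to imitate people.
engine_choice = {"start": "d2d4", "after_e4_e5": "g1f3", "after_nf3_nc6": "f1b5"}
human_like_choice = {"start": "e2e4", "after_e4_e5": "g1f3", "after_nf3_nc6": "f1b5"}

engine_accuracy = move_matching_accuracy(engine_choice.get, human_moves)
human_like_accuracy = move_matching_accuracy(human_like_choice.get, human_moves)
```

In this toy data the imitation-style predictor matches two of the three human moves and the win-oriented one matches only one. The numbers are invented purely to show how the metric behaves.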

The researchers did this by repurposing the AlphaZero deep neural network framework to predict human actions rather than the move most likely to win. Instead of training the algorithm on self-play games, they trained it on recorded human games so that it learns how humans play chess. The next part was creating the policy network that is responsible for the prediction. From this, “Maia” was built, and it has a “natural parametrization under which it can be targeted to predict human moves at a particular skill level” (McIlroy-Young, Sen, Kleinberg, Anderson, 2020).
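To make the change of training target concrete, here is a minimal sketch of how a policy network's output could be scored against the move a human played. It assumes a standard cross-entropy loss over the legal moves, which this summary does not specify. The scores and the helper names (`softmax`, `human_move_loss`) are hypothetical.

```python
import math
from typing import Dict


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn raw policy scores into a probability for each legal move."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {move: math.exp(value - peak) for move, value in scores.items()}
    total = sum(exps.values())
    return {move: e / total for move, e in exps.items()}


def human_move_loss(scores: Dict[str, float], human_move: str) -> float:
    """Cross-entropy loss: small when the policy favors the move the human played."""
    return -math.log(softmax(scores)[human_move])


# Hypothetical policy scores for three legal moves in one position.
policy_scores = {"e2e4": 2.0, "d2d4": 1.0, "g1f3": 0.0}

# The training label is the recorded human move, not the outcome of self-play.
loss_if_human_played_e4 = human_move_loss(policy_scores, "e2e4")
loss_if_human_played_nf3 = human_move_loss(policy_scores, "g1f3")
```

Minimizing this loss over many recorded games pushes the policy toward the moves people actually choose, which is a different goal from choosing the strongest move.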

The second task they developed was to predict when and whether human players would make a significant mistake, called a “blunder,” on their next move.

For this task, they designed a custom deep residual neural network and trained it on the same data. They found that the network outperforms competitive baselines at predicting whether humans will make a mistake (McIlroy-Young, Sen, Kleinberg, Anderson, 2020).
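Blunder prediction can be framed as a yes-or-no classification problem, sketched below under stated assumptions. The summary does not say how a "significant mistake" is measured, so the sketch assumes a move counts as a blunder when it lowers the player's estimated win probability by at least a fixed cutoff. The cutoff value, the win probabilities, and the predictions are all made up for illustration.

```python
from typing import List, Sequence

# Assumed cutoff for a "significant mistake"; not taken from the summary.
BLUNDER_THRESHOLD = 0.10


def is_blunder(win_prob_before: float, win_prob_after: float,
               threshold: float = BLUNDER_THRESHOLD) -> bool:
    """Label a move a blunder if it drops win probability by at least the cutoff."""
    return (win_prob_before - win_prob_after) >= threshold


def accuracy(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Share of moves where the predicted label matches the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy win probabilities (before, after) for four human moves; values are made up.
moves = [(0.60, 0.58), (0.55, 0.30), (0.70, 0.69), (0.50, 0.20)]
labels: List[bool] = [is_blunder(before, after) for before, after in moves]

# A hypothetical classifier's guesses, compared with a trivial baseline
# that never predicts a blunder.
model_guesses = [False, True, True, True]
baseline_guesses = [False] * len(labels)

model_accuracy = accuracy(model_guesses, labels)
baseline_accuracy = accuracy(baseline_guesses, labels)
```

Comparing a classifier against a simple baseline in this way is the general shape of the evaluation the summary describes. The residual network itself is not reproduced here.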

By designing AI with human collaboration in mind, one can accurately model granular human decision making. The design choices developers make are what lead to this type of performance, and the same approach can also help predict, and so better understand, human error (McIlroy-Young, Sen, Kleinberg, Anderson, 2020).

Reid McIlroy-Young, Siddhartha Sen, Jon Kleinberg, and Ashton Anderson. 2020. Aligning Superhuman AI with Human Behavior: Chess as a Model System. In Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’20), August 23–27, 2020, Virtual Event, CA, USA. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3394486.3403219