- Researchers have developed Pluribus, the first AI that can defeat human experts in a multiplayer game.
- Although it was designed specifically for the most popular poker format, its algorithms could also be applied to other multiplayer games.
- The same techniques could be used in other fields, including cybersecurity, fraud prevention, product pricing, and routing self-driving vehicles.
Over the past decade, artificial intelligence (AI) has made great strides. Games like Go and chess have become a standard way to measure that progress.
So far, almost all of AI’s game-playing milestones have come in two-player games, many of them games in which the opponent’s moves are clearly visible. The most popular forms of poker, on the other hand, involve multiple players and hidden cards, and combine gambling, strategy, and skill.
Now, researchers at Facebook and Carnegie Mellon University have developed an AI bot that can outwit a whole table of poker professionals. The bot, named Pluribus, is the first AI to out-bluff and out-bet human experts at six-player no-limit Texas Hold’em, the most popular format of poker.
Pluribus played 5,000 hands against poker experts (including two winners of the World Series of Poker) and won decisively. The AI adopted impressive strategies, such as donk betting (ending one betting round with a call and then opening the next round with a bet), and bluffed like a seasoned pro.
In fact, it was so successful that the developers decided not to publish its code, for fear it could wreck the online poker industry. The algorithm was so strong that the human experts could not find anything to exploit.
This is the first time an AI has defeated top professionals in any benchmark game with more than two players. The team has been working on this project for years. In 2017, they built a bot capable of playing one-on-one poker. Pluribus is a much more complex version of that bot.
Pluribus Is More Than Just Brute Computation
The core of Pluribus’s game plan is generated via self-play: the algorithm plays against copies of itself and gradually improves as it determines which actions lead to better outcomes.
This self-play produces a blueprint strategy for the whole game, computed offline. Then, during actual play against humans, the AI improves on the blueprint in real time by searching for a better strategy for the specific situation it finds itself in.
Pluribus’s blueprint strategy gradually improves during training on a 64-core processor
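The self-play idea described above can be illustrated with a small sketch. The article does not spell out Pluribus's training algorithm, so the code below swaps in regret matching, a simple self-play learning rule, and uses rock-paper-scissors instead of poker. All function names here are hypothetical and chosen only for illustration.

```python
# Toy self-play with regret matching on rock-paper-scissors.
# Assumption: this is an illustrative stand-in, not Pluribus's actual code.

# PAYOFF[a][b] is the payoff for playing action a against action b
# (0 = rock, 1 = paper, 2 = scissors). The game is symmetric, so both
# copies of the learner can share the same table.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
N = 3


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / N] * N  # no positive regret yet: play uniformly


def action_values(opponent_strategy):
    """Expected payoff of each action against the opponent's mixed strategy."""
    return [
        sum(PAYOFF[a][b] * opponent_strategy[b] for b in range(N))
        for a in range(N)
    ]


def self_play(iterations=10_000):
    """Two copies learn against each other; return their average strategies."""
    # One copy starts biased toward rock so there is something to correct.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * N, [0.0] * N]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            values = action_values(strategies[1 - player])
            expected = sum(s * v for s, v in zip(strategies[player], values))
            for a in range(N):
                # Regret: how much better action a would have done.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


def exploitability(strategy):
    """Best payoff an opponent can earn against this strategy (0 is ideal)."""
    return max(action_values(strategy))


if __name__ == "__main__":
    for avg in self_play():
        print([round(p, 3) for p in avg], round(exploitability(avg), 4))
```

The strategy played on any single iteration keeps cycling, but the average over all iterations drifts toward the balanced mix that no opponent can exploit. That averaged result plays the role of a precomputed "blueprint" in this toy setting.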
The AI uses new online search algorithms that evaluate its options efficiently by searching only a limited set of next moves instead of all possible moves. It also incorporates faster self-play algorithms for games with hidden information.
Together, these algorithms allow Pluribus to be trained on a modest computer with few resources. To put this into context, training Pluribus requires about $150 worth of cloud computing resources, whereas other recent breakthrough AI models have required millions of dollars’ worth.
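The idea of limiting the search, rather than exploring every possible continuation, can be sketched on a much simpler game. The example below is an assumption-laden illustration and not Pluribus's search: it uses a perfect-information stone-taking game (remove 1 to 3 stones, whoever takes the last stone wins), looks only a fixed number of moves ahead, and falls back on a cheap precomputed estimate at the cutoff. The names `blueprint_value`, `search`, and `best_move` are hypothetical.

```python
# Depth-limited lookahead on a toy stone-taking game.
# Assumption: illustrative only. Poker has hidden information, which a
# real search must handle and this sketch does not.

MOVES = (1, 2, 3)


def blueprint_value(pile):
    """Cheap precomputed estimate for the player to move (+1 good, -1 bad)."""
    return 1.0 if pile % 4 != 0 else -1.0


def search(pile, depth):
    """Value of the position for the player to move, looking `depth` moves ahead."""
    if pile == 0:
        return -1.0  # the opponent took the last stone, so the mover has lost
    if depth == 0:
        return blueprint_value(pile)  # stop searching, trust the estimate
    return max(
        -search(pile - take, depth - 1) for take in MOVES if take <= pile
    )


def best_move(pile, depth=2):
    """Pick the move whose resulting position is worst for the opponent."""
    legal = [take for take in MOVES if take <= pile]
    return max(legal, key=lambda take: -search(pile - take, depth - 1))


if __name__ == "__main__":
    print(best_move(10))  # prints 2, leaving the opponent a pile of 8
```

The design choice here is the trade-off the article describes: a shallow search is cheap enough to run during play, and the precomputed estimate supplies the judgment the search cannot reach on its own.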
The algorithms used to conquer poker can be implemented in other areas as well, such as pricing products, trading, and routing self-driving vehicles through busy traffic. These algorithms can also be applied to defeat humans in other multiplayer games and to develop more interesting computer games.