Trevor Savage has played poker professionally for 15 years, winning millions of dollars in the process. While he typically takes on humans, he faced a daunting new opponent in June: a powerful bot developed by researchers at Carnegie Mellon University and Facebook AI Research to trounce the world’s top players.
Savage and a dozen other professional poker players — all male, all playing remotely online — spent hours per day over 12 days last month, hunched over their computer screens, trying their best to beat an artificial intelligence system dubbed Pluribus. The humans were paid for their work: $50,000 divided among them, depending on how well they fared.
They were playing the most popular form of poker: no-limit Texas Hold ’Em. There were six players per game (sometimes five humans would play against Pluribus; sometimes five versions of the bot would play against one human). Over the course of 10,000 hands of poker, the AI system was a fierce competitor, winning in both types of play by a decisive margin, according to co-creator Noam Brown, a research scientist at Facebook AI Research.
Savage, who plays from his home office in West Deptford, New Jersey, didn’t fare well, but he was impressed by Pluribus’ style.
“It was clear the bot was a fundamentally sound, winning player,” he told CNN Business. “It mixed in strategies that most of the high stakes winning players would mix in.”
The feat represents the first time AI has beaten top human players in a poker game with this many players. Brown believes Pluribus provides a benchmark for the broader question of how we can get AI to deal with imperfect information in complicated environments, whether those environments are games or the real world. A research paper about Pluribus was published Thursday in the journal Science.
AI has been beating human players at games ranging from chess to Go to video games like StarCraft for years. Yet those systems have typically tackled two-player games, and many of those games (chess and Go in particular) are what are known as “perfect information” games, since every player can see the full state of the game.
In poker, however, you can’t know all the information that your opponent knows, so it’s more difficult to anticipate what moves they may make, and it only gets more difficult as more players join. These factors make poker a much harder game for computers to master.
Brown created Pluribus, which is Latin for “many,” with Tuomas Sandholm, a computer science professor at Carnegie Mellon University who also founded several companies to commercialize his work in AI. The system was trained by having the AI play against copies of itself, starting with no knowledge of how to play the game and improving as it went. In 2017, the researchers unveiled a bot called Libratus (Latin for “balanced”) that beat four leading players at no-limit Texas Hold ’Em, but that was in a two-player version of the game, making Pluribus’ success a clear advance in game-playing AI.
Brown thinks the technology behind Pluribus could eventually be used for applications that can involve multiple people and hidden information: think anything from fraud detection to self-driving cars.
Michael Wellman, a professor at the University of Michigan who focuses on game theory, said Pluribus’ success against human players is a pretty big deal.
“It’s an impressive technical achievement,” he said, adding that the AI underpinning Pluribus could be used for negotiations, cybersecurity or military strategy.
In fact, one of Sandholm’s companies, a startup called Strategy Robot that aims to come up with government applications for his AI game-playing work, already has a contract worth as much as $10 million with the US military. Brown said Pluribus would not be used for that particular application. (He said his employer is interested in this kind of research chiefly to advance the understanding of AI.)
Though real-world applications for Pluribus may be some way off, there are some poker-related tips that humans can take from it today, Brown said. For instance, in some situations it would bet much larger amounts of money than humans tend to, a move that pros indicated could be smart in some cases. And it went against conventional poker wisdom by determining that a strategy known as “donk betting,” where a player begins a round by betting after ending the previous round with a call, could be a good play.
“I’ve obviously gotten better, that’s for sure,” Brown said.