The AI program AlphaHoldem matched four high-level human players in a 10,000-hand two-player competition after just three days of self-play training, according to a paper to be presented at AAAI 2022, a global AI conference to be held in Vancouver in February next year.
Texas hold'em is a popular poker game in which players often deceive and bluff. Because decisions must be made with imperfect information, it resembles real-world problems more closely than Go or chess.
The researchers from the Institute of Automation under the Chinese Academy of Sciences (CAS) reported that AlphaHoldem, a fast learner, needed only about three to four milliseconds per action, roughly 1,000 times faster than the first-generation AI hold'em players DeepStack and Libratus.
AlphaHoldem got the better of DeepStack in a 100,000-hand competition, according to the researchers.
DeepStack, developed by the University of Alberta, and Libratus, developed by Carnegie Mellon University, beat professional players at heads-up no-limit Texas hold'em in 2016 and 2017, respectively.
The two earlier AI players, both based on an algorithm called counterfactual regret minimization (CFR), took three and four seconds per action, respectively, and consumed large amounts of computing power, the researchers said.
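At the heart of counterfactual regret minimization is regret matching: the algorithm repeatedly plays the game against itself, accumulates for each action the "regret" of not having chosen it, and then mixes future actions in proportion to positive regret. The sketch below illustrates that core idea on rock-paper-scissors, a far simpler game than poker; it is a hypothetical toy, not the actual code behind DeepStack, Libratus, or AlphaHoldem, and all names in it are illustrative.

```python
# Regret matching, the update rule at the core of counterfactual regret
# minimization (CFR), sketched on rock-paper-scissors. Illustrative toy
# only; not the code used by DeepStack, Libratus, or AlphaHoldem.

ACTIONS = 3  # rock, paper, scissors
# PAYOFF[a][b]: payoff to the player choosing action a against action b
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def strategy_from_regrets(regrets):
    """Mix actions in proportion to positive accumulated regret."""
    positives = [max(r, 0.0) for r in regrets]
    total = sum(positives)
    if total > 0:
        return [p / total for p in positives]
    return [1.0 / ACTIONS] * ACTIONS  # uniform when nothing is regretted

def expected_payoff(action, opponent_strategy):
    return sum(opponent_strategy[b] * PAYOFF[action][b]
               for b in range(ACTIONS))

def train(iterations=20000):
    # Asymmetric starting regrets so the dynamics are visible; CFR's
    # guarantee is that the *average* strategy approaches an equilibrium.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sum = [[0.0] * ACTIONS, [0.0] * ACTIONS]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for p in range(2):
            opponent = strategies[1 - p]
            utils = [expected_payoff(a, opponent) for a in range(ACTIONS)]
            node_util = sum(s * u for s, u in zip(strategies[p], utils))
            for a in range(ACTIONS):
                # Regret of not having played a instead of the current mix.
                regrets[p][a] += utils[a] - node_util
                strategy_sum[p][a] += strategies[p][a]
    # Normalize the accumulated strategies to get the average strategy.
    return [[s / iterations for s in strategy_sum[p]] for p in range(2)]

avg = train()  # both average strategies approach (1/3, 1/3, 1/3)
```

In full CFR this regret bookkeeping is carried out at every decision point of the game tree, which is why scaling it to no-limit hold'em demanded the heavy computation described above.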
AlphaHoldem, which employs a new framework that incorporates deep learning into a novel self-play algorithm, used only eight GPUs during training, ultra-lightweight compared with the 13,000 GPUs DeepStack required, according to a recent CAS news release.
Looking forward, the researchers said they will apply the underlying technology to other imperfect-information games such as mahjong and bridge, fostering smarter AI.