Multi-armed bandit and Thompson Sampling in C++ and Python
“Multi-armed bandit” (MAB) is a statistical problem in which a gambler faces several slot machines (sometimes known as “one-armed bandits” because they “rob” the player) with unknown payouts and has to decide which machines to play, in what order, and how many times, in order to maximize the total payout.
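To make the setup concrete, here is a minimal Python sketch of the problem itself, not of any particular solution. It assumes the simplest payout model, in which each machine pays 1 with a fixed hidden probability and 0 otherwise; the names `BernoulliBandit` and `play_uniformly` and the example probabilities are made up for illustration.

```python
import random


class BernoulliBandit:
    """Hypothetical set of slot machines (assumption: each arm pays 1 with a
    fixed probability that is hidden from the player, and 0 otherwise)."""

    def __init__(self, payout_probs, seed=None):
        self._probs = list(payout_probs)  # unknown to the gambler
        self._rng = random.Random(seed)

    @property
    def n_arms(self):
        return len(self._probs)

    def pull(self, arm):
        """Play one machine once and return the payout (0 or 1)."""
        return 1 if self._rng.random() < self._probs[arm] else 0


def play_uniformly(bandit, n_rounds, seed=None):
    """Naive baseline: pick a machine uniformly at random every round.

    Returns per-arm pull counts and per-arm win counts, which are the only
    information a gambler can actually observe.
    """
    rng = random.Random(seed)
    pulls = [0] * bandit.n_arms
    wins = [0] * bandit.n_arms
    for _ in range(n_rounds):
        arm = rng.randrange(bandit.n_arms)
        pulls[arm] += 1
        wins[arm] += bandit.pull(arm)
    return pulls, wins


if __name__ == "__main__":
    # Illustrative payout probabilities, chosen arbitrarily for this sketch.
    bandit = BernoulliBandit([0.2, 0.5, 0.7], seed=0)
    pulls, wins = play_uniformly(bandit, n_rounds=1000, seed=1)
    for arm, (n, w) in enumerate(zip(pulls, wins)):
        rate = w / n if n else 0.0
        print(f"arm {arm}: pulled {n} times, observed win rate {rate:.2f}")
    print("total payout:", sum(wins))
```

The uniform player spreads its pulls evenly and therefore wastes many rounds on the worse machines. A better strategy has to use the observed pull and win counts to shift play toward the machines that appear to pay more, while still trying the others often enough to avoid being misled by early luck.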