Which Machine-Learning Model Do You Want In Your Ocean’s Eleven: A Computational Prisoner’s Dilemma Simulation
Date
2024
Abstract
Which machine-learning model is best at winning the prisoner’s dilemma? Which models produce the best cumulative outcomes? Is there a model that does well on both measures, winning and cumulative points? These questions arise from the simple 2 × 2 payoff matrix of the prisoner’s dilemma. Imagine you and your partner in crime are caught and questioned separately. You can either collaborate or defect, but you don’t know what your partner will do. If you both collaborate, you’ll each get one year in jail. If you defect and your partner collaborates, you’ll serve no time and they will serve 10 years, and vice versa. If you both defect, you’ll both serve 5 years in jail. Placing AIs against each other for just one round doesn’t reveal much about their code, strategy, and end goal. So the Reinforcement Learning, Pattern Learning, Tit For Tat, and other models were pitted against each other in a 100-round game, and their behavior, convergence, and learning were analyzed to reveal the most effective strategies for beating the prisoner’s dilemma. All code and data are open sourced here.
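The payoff matrix and the 100-round format described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the history-based strategy interface, and the always-defect opponent are assumptions made for illustration, and only the jail terms (1, 0/10, 5 years) and the round count come from the abstract. Lower totals are better, since scores are years in jail.

```python
# Years in jail for (move_a, move_b), taken from the abstract.
# "C" = collaborate, "D" = defect. Lower is better.
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "D"): (10, 0),
    ("D", "C"): (0, 10),
    ("D", "D"): (5, 5),
}


def tit_for_tat(own_history, partner_history):
    """Collaborate first, then copy the partner's previous move."""
    return partner_history[-1] if partner_history else "C"


def always_defect(own_history, partner_history):
    """Hypothetical baseline opponent (an assumption, not named in the abstract)."""
    return "D"


def play(strategy_a, strategy_b, rounds=100):
    """Play a repeated game and return total years in jail for each player."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        # Each strategy sees its own history first, then its partner's.
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        years_a, years_b = YEARS[(move_a, move_b)]
        total_a += years_a
        total_b += years_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total_a, total_b


if __name__ == "__main__":
    print(play(tit_for_tat, always_defect))  # (505, 495)
    print(play(tit_for_tat, tit_for_tat))    # (100, 100)
```

In this sketch, always-defect "wins" its match against Tit For Tat by 10 years, yet both players end up with roughly five times the jail time that two Tit For Tat players accumulate together, which is the tension between winning and cumulative outcome that the abstract asks about.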
Keywords
Computer Science, Game Theory