Grade Level at Time of Presentation
Sophomore
Institution
Western Kentucky University
KY House District #
23
KY Senate District #
32
Faculty Advisor/Mentor
Dr. Uta Ziegler, PhD
Department
The School of Engineering and Applied Sciences
Abstract
Monte-Carlo tree search (MCTS) is a method designed to solve difficult learning problems. MCTS performs random simulations from the current situation and stores the results so that it can rank decisions by their past success. It then selects the best decision and repeats the process. Parallelizing MCTS means dividing the learning process among independent learners; after a fixed number of simulations, their data is shared and combined. Past research has shown that this approach is faster than non-parallelized approaches. We therefore anticipated that the time saved by dividing the learning would outweigh the potential cost of redundant learning. Because it is often difficult to determine the effectiveness of algorithms in complex environments, it can be advantageous to develop strategies in simple environments, such as games, and then translate them for use in broader real-life fields. In this project, we explored how controlling various resources affected the win ratio achieved in the game Dots and Boxes when play is learned through a parallelized MCTS approach. The factors we manipulated were the number of simulations, the number of independent learners, the amount of information shared by the independent learners, and how frequently they share it. The win ratio was computed as the number of wins divided by the total number of games played. We present an algorithm along with our findings, as well as details and results of our modified MCTS implementation.
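The share-and-combine step described in the abstract can be illustrated with a minimal Python sketch. It is not the implementation used in this project: it is a simplified, single-level Monte Carlo search (no tree below the current decision), the learners run one after another rather than in parallel, and the `simulate` function with its toy win probabilities is a hypothetical stand-in for a random Dots and Boxes playout. All function names are assumptions made for illustration.

```python
import random
from collections import Counter


def simulate(move, rng):
    """Hypothetical stand-in for one random playout: 1 for a win, 0 otherwise."""
    # Assumed toy win probabilities; a real playout would finish a game of Dots and Boxes.
    win_prob = {"a": 0.2, "b": 0.5, "c": 0.8}
    return 1 if rng.random() < win_prob[move] else 0


def learn(moves, n_simulations, seed):
    """One independent learner: run random simulations and record wins and visits per move."""
    rng = random.Random(seed)
    wins, visits = Counter(), Counter()
    for _ in range(n_simulations):
        move = rng.choice(moves)
        wins[move] += simulate(move, rng)
        visits[move] += 1
    return wins, visits


def merge(results):
    """Share and combine: sum the per-move statistics of all learners."""
    total_wins, total_visits = Counter(), Counter()
    for wins, visits in results:
        total_wins.update(wins)      # Counter.update adds counts
        total_visits.update(visits)
    return total_wins, total_visits


def best_move(wins, visits):
    """Select the visited move with the highest observed success rate."""
    return max(visits, key=lambda m: wins[m] / visits[m])


def win_ratio(wins, games):
    """Number of wins over the total number of games."""
    return wins / games if games else 0.0


if __name__ == "__main__":
    # Four learners with 500 simulations each, combined after they finish.
    results = [learn(["a", "b", "c"], 500, seed) for seed in range(4)]
    wins, visits = merge(results)
    print(best_move(wins, visits), win_ratio(sum(wins.values()), sum(visits.values())))
```

In this sketch the number of learners, the number of simulations per learner, and the point at which `merge` is called correspond loosely to the resources varied in the study; how much information is shared would be controlled by what each learner passes to `merge`.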
Performance of the Parallelized Monte-Carlo Tree Search Approach for Dots and Boxes