Grade Level at Time of Presentation

Sophomore

Institution

Western Kentucky University

KY House District #

23

KY Senate District #

32

Department

The School of Engineering and Applied Sciences

Abstract

Monte-Carlo tree search (MCTS) is a method for solving difficult learning problems. MCTS runs random simulations from the current state and stores the results so that decisions can be ranked by their past success. It then selects the best decision and repeats the process. Parallelizing MCTS means dividing the learning process among independent learners; after a fixed number of simulations, their data is shared and combined. Past research has shown that this approach is faster than non-parallelized approaches. We therefore anticipated that the time saved by dividing the learning would outweigh the potential cost of redundant learning. Because it is often difficult to determine the effectiveness of algorithms in complex environments, it is sometimes more advantageous to develop strategies in simple environments, such as games, and then translate them for use in broader real-life fields. In this project, we explored how controlling various resources affected win-ratio performance in the game Dots and Boxes when play is learned through a parallelized MCTS approach. The factors we manipulated were the number of simulations, the number of independent learners, the amount of information shared among these learners, and how frequently they share it. Win-ratio performance was calculated as the number of wins divided by the total number of games played. We present an algorithm together with our findings, along with details and results of our modified MCTS implementation.
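
The share-and-combine step described above can be illustrated with a minimal sketch. It is not the implementation from this project: it uses flat Monte Carlo sampling at the root rather than a full tree search, it replaces a real Dots and Boxes playout with a random draw against hypothetical per-move win probabilities, and every name in it (`simulate_learner`, `merge`, `best_move`, `win_ratio`, `WIN_PROB`) is an assumption made for illustration.

```python
import random
from collections import Counter

# Hypothetical stand-in for a game: each candidate move has a hidden
# probability of leading to a win. A real playout would play Dots and Boxes.
WIN_PROB = {"a": 0.2, "b": 0.8, "c": 0.5}


def simulate_learner(win_prob, n_simulations, seed):
    """One independent learner: sample moves at random and record outcomes."""
    rng = random.Random(seed)
    moves = sorted(win_prob)
    wins, visits = Counter(), Counter()
    for _ in range(n_simulations):
        move = rng.choice(moves)
        visits[move] += 1
        if rng.random() < win_prob[move]:  # stand-in for a random playout
            wins[move] += 1
    return wins, visits


def merge(results):
    """Combine the learners' statistics by summing wins and visits per move."""
    total_wins, total_visits = Counter(), Counter()
    for wins, visits in results:
        total_wins.update(wins)      # Counter.update adds counts
        total_visits.update(visits)
    return total_wins, total_visits


def best_move(wins, visits):
    """Pick the visited move with the highest observed win fraction."""
    return max(visits, key=lambda m: wins[m] / visits[m])


def win_ratio(wins, total_games):
    """Win ratio as defined in the abstract: wins over total games."""
    return wins / total_games


# Four independent learners, 500 simulations each, then one sharing step.
results = [simulate_learner(WIN_PROB, 500, seed) for seed in range(4)]
wins, visits = merge(results)
print(best_move(wins, visits), win_ratio(3, 4))
```

In this sketch the learners share everything once, at the end. The number of learners, the simulations per learner, how much is shared, and how often correspond to the factors the abstract says were varied.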

Performance of the Parallelized Monte-Carlo Tree Search Approach for Dots and Boxes
