Author(s): Joonkyum Lee, Gun Jea Yu
Striking the right balance between exploration and exploitation is critical to organizational success. The optimal strategy depends on the structure of the environment: the difference in success probability between good and bad alternatives, the sparsity of good alternatives, and the relative value of selecting a good alternative. Nevertheless, the relationship between environmental structure and the performance of exploration-exploitation strategies remains largely unexplored. We use a simulation experiment based on the multi-armed bandit model to investigate how these strategies perform across environments. We find that a high level of exploration is beneficial where success probabilities differ among alternatives and when superior alternatives are sparse. The relative performance gap between the optimal strategy and suboptimal strategies widens as the relative value of superior alternatives grows. We also show the mechanism underlying the pattern of optimal exploration levels across environmental structures.
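To make the simulation setup concrete, the following is a minimal sketch of a multi-armed bandit experiment of the kind described above. It is not the authors' implementation. The abstract does not name the strategy or the parameter values, so the epsilon-greedy rule, the Bernoulli rewards, and all names and defaults (`simulate_bandit`, `average_performance`, `n_good`, `p_good`, `p_bad`, `epsilon`) are assumptions chosen for illustration. Here `epsilon` stands in for the exploration level, `p_good - p_bad` for the difference in success probability, and `n_good / n_arms` for the sparsity of good alternatives. The relative value of good alternatives is not modeled.

```python
import random


def simulate_bandit(n_arms=10, n_good=2, p_good=0.6, p_bad=0.4,
                    epsilon=0.1, n_rounds=1000, seed=0):
    """Run one epsilon-greedy episode on Bernoulli arms; return the mean reward."""
    rng = random.Random(seed)
    # Hypothetical environment: n_good arms succeed with p_good, the rest with p_bad.
    probs = [p_good] * n_good + [p_bad] * (n_arms - n_good)
    rng.shuffle(probs)

    pulls = [0] * n_arms
    wins = [0] * n_arms
    total = 0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest observed success rate.
            # Untried arms count as 0.0; ties go to the lowest index.
            arm = max(range(n_arms),
                      key=lambda a: wins[a] / pulls[a] if pulls[a] else 0.0)
        reward = 1 if rng.random() < probs[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
        total += reward
    return total / n_rounds


def average_performance(epsilon, n_runs=100, **env):
    """Average the mean reward over n_runs independent seeds."""
    runs = (simulate_bandit(epsilon=epsilon, seed=s, **env) for s in range(n_runs))
    return sum(runs) / n_runs


if __name__ == "__main__":
    # Compare exploration levels in one assumed environment.
    for eps in (0.0, 0.05, 0.2, 0.5):
        print(eps, round(average_performance(eps, n_good=1), 3))
```

Varying `n_good`, `p_good`, and `p_bad` while sweeping `epsilon` gives a rough way to see how the best-performing exploration level shifts with the environment, under the assumptions stated above.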