From Chess and Atari to StarCraft and Beyond: How Game AI is Driving the World of AI
This paper reviews the field of Game AI, which deals not only with creating agents that can play a certain game, but also with areas as diverse as automatic game content creation, game analytics, and player modelling. While Game AI was for a long time not well recognized by the larger scientific community, it has established itself as a research area for developing and testing the most advanced forms of AI algorithms, and articles covering advances in mastering video games such as StarCraft 2 and Quake III appear in the most prestigious journals. Because of the growth of the field, a single review cannot cover it completely. Therefore, we focus on important recent developments, including the fact that advances in Game AI are starting to be extended to areas outside of games, such as robotics or the synthesis of chemicals. In this article, we review the algorithms and methods that have paved the way for these breakthroughs, report on the other important areas of Game AI research, and point out exciting directions for the future of Game AI.
For a long time, games research and especially research on Game AI was in a niche, largely unrecognized by the scientific community and the general public. Proponents of Game AI research wrote advertisement articles to justify the research field and substantiate the call for strengthening it (e.g. lucas2006evolutionary). The main arguments have been these:
By tackling game problems as comparably cheap, simplified representatives of real world tasks, we can improve AI algorithms much more easily than by modeling reality ourselves.
Games resemble formalized (hugely simplified) models of reality and by solving problems on these we learn how to solve problems in reality.
Both arguments have at first nothing to do with games themselves but see them as modeling / benchmarking tools. In our view, they are more valid than ever. However, as in many other digital systems, there has also been, and still is, a strong intrinsic need for improvement because the performance of Game AI methods was in many cases too weak to be of practical use. This could be in terms of playing strength, or simply because they failed to produce believable behavior Livingstone2006. The latter is necessary to maintain the suspension of disbelief, or, in other words, the player's willingness to be immersed in a game world.
But what exactly is Game AI? Opinions on that have certainly changed in the last 10 to 15 years. For a long time, academic research and the game industry were largely unconnected: researchers did not tackle the AI-related problems game makers had, and game makers did not discuss with researchers what these problems actually were. Then, some voices emerged in research, calling for more attention to computer Game AI (partly as opposed to board game AI), including Nareyek Nareyek2001; Nareyek2007, Mateas Mateas2003, Buro Buro2003, and Yannakakis Yannakakis2012.
Proponents of a change included Alex Champandard in his Computational Intelligence and Games conference (CIG) 2010 tutorial YannakakisT11 and Youichiro Miyake in his GameOn Asia 2012 keynote.
Champandard and Miyake both argued that research should tackle problems that are also relevant for the games industry.
This led to a shift in the focus of Game AI research that was further intensified by a series of Dagstuhl meetings on Game AI that started in 2012.
The 2018 book on AI and Games gameAIbook distinguishes pre-game (design / production), during-game (game playing), and after-game (player modeling / game analysis) uses of AI.
Most of the widely known recent successes are connected to big AI-heavy IT companies entering the field, such as DeepMind (Google), Facebook AI and OpenAI. Equipped with rich computational and human resources, these new players have especially profited from Deep (Reinforcement) Learning to successfully tackle difficult problems of human decision making that were previously seen as important milestones for AI, such as Go, Dota2, and StarCraft.
It is, however, a fairly open question how we can utilize these successes for solving other problems in Game AI and beyond. As it appears to be possible but very difficult to transfer whole algorithmic solutions, e.g., for a complex game such as StarCraft, to a completely different domain, we may rather see innovative recombinations of algorithms from the recently enriched portfolio in order to craft solutions for new problems.
In the next sections, we start by listing some important terms that will be used repeatedly (Sect. 2) before tackling state / action based learning in Sect. 3. We then report on pixel-based learning in Sect. 4. At this point, PCG comes in as a flexible testbed generator (Sect. 5). However, generating content is also a viable aim on its own. Very recently, different sources of game information, such as pixel and state information, have been given as input to game-playing agents, providing better methods for rather complex games (Sect. 6). While many approaches are tuned to one game, others explicitly strive for more generality (Sect. 7). Next to game playing and generating content, we also briefly discuss AI in other roles (Sect. 8). We conclude the article with a short overview of the most important publication venues and test environments in Sect. 9 and some reasoning about the expected future developments in Game AI in Sect. 10.
2 Algorithmic approaches and game genres
We provide an overview of the predominant paradigms / algorithm types and game genres, focusing mostly on game playing and more recent literature. These algorithms are used in many other contexts of AI and application areas of course, but some of their most popular successes have been achieved in the Game AI field.
Reinforcement Learning (RL). In reinforcement learning an agent learns to perform a task through interactions with its environment and through rewards. This is in contrast to supervised learning, in which the agent is directly told the correct action in different states. One of the main challenges in RL is to find a balance between exploitation (i.e. seeking out states that are known to give a high reward) vs. exploration (i.e. trying out something new that might lead to higher rewards in the long run).
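The exploration/exploitation trade-off can be illustrated with a minimal sketch of an epsilon-greedy agent on a multi-armed bandit. This is an illustrative toy, not a method from the reviewed literature; the function name `run_bandit` and all parameters (`true_means`, `epsilon`, `noise`) are assumptions made for this example.

```python
import random

def run_bandit(true_means, epsilon, steps, noise=1.0, seed=0):
    """Epsilon-greedy agent on a bandit with Gaussian rewards (hypothetical toy setup)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try a random arm that might turn out to be better.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pull the arm with the highest estimated reward.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], noise)
        counts[arm] += 1
        # Incremental update of the running mean for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward

if __name__ == "__main__":
    estimates, counts, total = run_bandit([0.1, 0.5, 0.9], epsilon=0.1, steps=2000)
    print("pull counts per arm:", counts)
```

With `epsilon = 0` the agent never explores and can lock onto the first arm that yields a positive reward; a small positive `epsilon` lets it discover better arms at the cost of some wasted pulls.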
Deep Learning (DL). Deep learning is a broad term and comes in a variety of different shapes and sizes. The main distinguishing feature of deep learning is the idea to learn progressively higher-level features through multiple layers of non-linear processing. The most prevalent deep learning methods are based on deep neural networks, which are artificial neural networks with multiple different layers (in new neural network models these can be more than 100 layers). Recent advances in computing power, such as more and more efficient GPUs (which were first developed for fast rendering of 3D games), more data, and various training improvements have allowed deep learning methods to surpass the previous state-of-the-art in many domains such as image recognition, speech recognition or drug discovery. LeCun et al. lecun2015deep provide a good review paper on this fast-growing research area.
Deep Reinforcement Learning. Deep Reinforcement Learning combines reinforcement learning with deep neural networks to create efficient algorithms that can learn directly from high-dimensional sensory streams. Deep RL has been the workhorse behind many of the recent advances in Game AI, such as beating professional players in StarCraft and Dota2. arulkumaran2017brief provides a good overview of deep RL.
Monte Carlo Tree Search (MCTS). Monte Carlo Tree Search is a fairly recent Coulom06 randomized tree search algorithm. States of the mapped system are nodes in the tree, and possible actions are edges that lead to new states. In contrast to older methods such as alpha-beta pruning, it does not attempt to look at the full tree but combines controlled exploration with exploitation of already obtained knowledge (successful branches are preferred). It often relies on fully randomized playouts, meaning that a game is played until it ends by applying randomized actions. If that takes too long, state value heuristics can be used instead. Win/loss information is propagated upwards to the tree root such that estimates of the win ratio become available at every node for directing the search. MCTS can thus be applied to much larger trees, but provides no guarantee of finding optimal solutions. Browne2012 is a popular introductory survey.
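The four phases described above (selection, expansion, random playout, backpropagation) can be sketched on a tiny game. The example below is a minimal illustration using the common UCB1 selection rule on a single-pile Nim variant (remove 1 to 3 stones, whoever takes the last stone wins). The game, the class `Node`, the function `mcts_best_move`, and the exploration constant are assumptions chosen for this sketch and are not taken from any of the cited systems.

```python
import math
import random

def legal_moves(stones):
    """A player may remove 1, 2, or 3 stones, but not more than remain."""
    return [m for m in (1, 2, 3) if m <= stones]

class Node:
    def __init__(self, stones, player_just_moved, parent=None, move=None):
        self.stones = stones
        self.player_just_moved = player_just_moved
        self.parent = parent
        self.move = move
        self.children = []
        self.untried = legal_moves(stones)
        self.visits = 0
        self.wins = 0.0  # wins counted from the view of player_just_moved

def uct_child(node, c=1.4):
    """Select the child maximising win ratio plus an exploration bonus (UCB1)."""
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )

def mcts_best_move(stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(stones, player_just_moved=1)  # player 0 is to move at the root
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded and non-terminal.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one previously untried move as a new child.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move, 1 - node.player_just_moved, node, move)
            node.children.append(child)
            node = child
        # 3. Playout: play uniformly random moves until the game ends.
        remaining, last = node.stones, node.player_just_moved
        while remaining > 0:
            remaining -= rng.choice(legal_moves(remaining))
            last = 1 - last
        winner = last  # the player who took the last stone
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            if node.player_just_moved == winner:
                node.wins += 1
            node = node.parent
    # Recommend the most visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move

if __name__ == "__main__":
    print("suggested move with 5 stones:", mcts_best_move(5))
```

In this Nim variant, leaving the opponent a multiple of four stones is a winning strategy, so the search can be checked against known theory. As stated in the text, the result is a statistical estimate and comes with no optimality guarantee for larger games.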
Evolutionary Algorithms (EA). Also known as bio-inspired optimization algorithms, Evolutionary Algorithms take inspiration from natural evolution for solving black-box optimization problems. They are thus applied when classical optimization methods fail or cannot be employed because no gradient, or not even a numeric objective value (but only a ranking of solutions), is available. A key idea of EAs is parallel search by means of populations of candidate solutions, which are concurrently improved, making them global optimization methods. EAs are especially well suited for multi-objective optimization, and the well-known GA, NSGA-II, and CMA-ES algorithms are all EAs; see also the introduction/survey book Eiben2015.
Which are the most important games to serve as testbeds in Game AI? The research-oriented frameworks general game playing (GGP), general video Game AI (GVGAI) and the Arcade Learning Environment (ALE) play an important role but are somewhat far from modern video games. This also holds true for the traditional AI challenge board games Chess and Go and card games such as Poker or Hanabi. In video games, the predominant genres are real-time strategy (RTS) games such as StarCraft, multiplayer online battle arena (MOBA) games such as Dota2, and first person shooter (FPS) games such as Doom. Sports games are currently becoming more important liu2019 as they often represent a competitive team situation that is seen as similar to many real-world human/AI collaborative problems. In a similar way, cooperative (capture-the-flag) variants of FPS games jaderberg2019human are used. Figures 1 and 2 provide an overview of the different properties of the games used as AI testbeds.
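A minimal sketch of the population-based search described above is a (mu + lambda) evolution strategy. Note that selection below only uses the ranking of candidates, not gradients. The test function `sphere`, the function `evolve`, and all parameter values are assumptions for illustration; this is far simpler than NSGA-II or CMA-ES.

```python
import random

def sphere(x):
    """Toy black-box objective to minimise (optimum at the origin)."""
    return sum(v * v for v in x)

def evolve(fitness, dim=5, mu=5, lam=20, sigma=0.3, generations=100, seed=0):
    """(mu + lambda) evolution strategy with fixed Gaussian mutation strength."""
    rng = random.Random(seed)
    population = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(mu)]
    history = []
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            parent = rng.choice(population)
            # Mutation: perturb every coordinate with Gaussian noise.
            offspring.append([v + rng.gauss(0, sigma) for v in parent])
        # Plus-selection: parents and offspring compete; only their ranking matters.
        population = sorted(population + offspring, key=fitness)[:mu]
        history.append(fitness(population[0]))
    return population[0], history

if __name__ == "__main__":
    best, history = evolve(sphere)
    print("best fitness after first and last generation:", history[0], history[-1])
```

Because parents survive into the next generation if they are not outperformed, the best fitness in `history` can never get worse from one generation to the next.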
3 Learning to play from states and actions
Games have for a long time served as invaluable testbeds for research in artificial intelligence (AI). In the past, particularly board games such as Checkers and Chess were tackled. Attention later turned to Go, once Checkers had been solved Schaeffer2007 and DeepBlue campbell2002deep had consistently defeated the world champion in Chess. All these games and many more, up to Go, have one thing in common: they can be expressed well by states and actions, where the number of reasonable moves from any possible position is usually not too large, often around 100 or fewer. For quite some time, board games were tackled with alpha-beta pruning (Turing Award winners Newell and Simon explain in newellturing how this idea came up several times almost at once) and very sophisticated and extremely specialized heuristics, before Coulom invented Monte Carlo Tree Search (MCTS) Coulom06 in 2006. MCTS gives up optimality (full exploration) in exchange for speed and therefore now dominates AI solutions for larger board games such as Go, with about $10^{170}$ possible states (board positions). MCTS-based Go algorithms greatly improved the state-of-the-art up to the level of professional players by incorporating sophisticated heuristics such as Rapid Action Value Estimation (RAVE) Gelly2011. Subsequently, MCTS-based approaches were shown to also cope well with real-time conditions as in the PacMan game Pepels2014 and with hidden information games Powley2014.
However, only the combination of MCTS with DL led to a world-class professional human-level Go AI player named AlphaGo Silver2016. At this stage, human experience (recorded grandmaster games) was used for "seeding" the learning process, which was then accelerated by self-play. By playing against itself, the AlphaGo algorithm was able to steadily improve its value (how good is the current state?) and policy (what is the best action to play?) artificial neural networks. The next step, AlphaGo Zero silver2017mastering, removed all human data, relying on self-play alone, and learned from scratch to play Go better than the original AlphaGo approach. This approach was further developed into AlphaZero Silver2018 and shown to be able to learn to play different games: next to Go, also Chess and Shogi (Japanese Chess). In-depth coverage of most of these developments is also provided in Plaat2020.
From the last paragraphs, it may appear as if learning via self-play is limited to two-player perfect information games only. However, multi-player partial information games such as Poker Brown2019 and even cooperative multi-player games such as Hanabi Lerer2019 have also recently been tackled, and AI players now exist that can play these games at the level of the best human players. Thus, is self-play the ultimate AI solution for all games? Seemingly not, as vinyals2019 suggests (see Sect. 6). However, this may be a question of the number of actions and states in a game and remains to be seen. Nevertheless, board games and card games are obviously good candidates for such AI approaches.
4 Learning to play from pixels
For a long time, learning directly from high-dimensional input data such as the pixels of a video game was an unsolved challenge. Earlier neural network-based approaches for playing games such as Pac-Man relied on carefully engineered features such as the distance to the nearest ghost or pill, which are given as input to the neural network risi2015neuroevolution.
While some earlier game-playing approaches, especially from the evolutionary computation community, showed initial success in learning directly from pixels parker2012neurovisual; gallagher2007evolving; togelius2009super; hausknecht2014neuroevolution, it was not until DeepMind’s seminal paper on learning to play Atari video games from pixels Mnih2013; mnih2015human that these approaches started to compete with and at times outperform human players. With Atari video games serving as a common benchmark, many novel AI algorithms have first been developed and compared on them justesen19review before being applied to other domains such as robotics akkaya2019solving. A computationally cheap and thus interesting end-to-end pixel-based learning environment is VizDoom kempka2016vizdoom, a competition setting that relies on a rather old game that is run in very small screen resolutions. Low resolution pixel inputs are also employed in the obstacle tower challenge (OTC) Juliani2019.
DeepMind’s paper ushered in the era of Deep Reinforcement Learning, combining reinforcement learning with a rich neural network-based representation (see infobox for more details). Deep RL has since established itself as the prevailing paradigm for learning directly from high-dimensional input such as images, videos, or sounds without the need for human-designed features or preprocessing. More recently, approaches based on evolutionary algorithms have also been shown to be competitive with gradient descent-based methods such2017deep.
However, some of the Atari games, namely Montezuma’s Revenge, Pitfall, and others proved to be too difficult to solve with standard deep RL approaches mnih2015human because of sparse and/or late rewards. These hard-exploration games can be handled successfully by evolutionary algorithms that explicitly favor exploration such as Go-Explore ecoffet2019go.
A recent trend in deep RL is to allow agents to learn a general model of how their environment behaves and use that model to explicitly plan ahead. For games, one of the first approaches was the World Model introduced by ha2018world, in which an agent learns to solve a challenging 2D car racing game and a 3D VizDoom environment from pixels alone. In this approach, the agent first collects observations from the environment and then trains a forward model that takes the current state of the environment and an action and tries to predict the next state. Interestingly, this approach also allowed an agent to get better by training inside a hallucinated environment created through a trained world model.
Instead of first training the world model on random rollouts, follow-up work showed that end-to-end learning through reinforcement learning hafner2018learning and evolution risi2019gecco; risi2019improving is also possible. We will discuss MuZero as another example of planning in latent space in Section 6.
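The model-based recipe described in this section (collect observations, fit a forward model that maps state and action to the next state, then plan inside the learned model) can be sketched in a few lines. The example below is a deliberately simplified stand-in: it uses a lookup table instead of a neural network and a five-cell corridor instead of pixels. The environment, the names `env_step`, `collect_rollouts`, `fit_forward_model` and `plan`, and all parameters are assumptions for illustration and do not reproduce the World Model architecture.

```python
import random

GOAL = 4           # rightmost cell of a 5-cell corridor (cells 0..4)
ACTIONS = (-1, 1)  # move left or right

def env_step(state, action):
    """True environment dynamics; the agent only sees samples of it."""
    nxt = min(max(state + action, 0), GOAL)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done

def collect_rollouts(episodes=200, horizon=10, seed=0):
    """Gather (state, action, next_state, reward, done) tuples with a random policy."""
    rng = random.Random(seed)
    data = []
    for _ in range(episodes):
        state = rng.randrange(GOAL)
        for _ in range(horizon):
            action = rng.choice(ACTIONS)
            nxt, reward, done = env_step(state, action)
            data.append((state, action, nxt, reward, done))
            if done:
                break
            state = nxt
    return data

def fit_forward_model(data):
    """Tabular forward model: (state, action) -> (next_state, reward, done)."""
    return {(s, a): (nxt, r, done) for s, a, nxt, r, done in data}

def plan(model, state, depth=6, gamma=0.9):
    """Pick the first action of the best lookahead, using only the learned model."""
    def q(s, a, d):
        if (s, a) not in model:
            return float("-inf")  # transition never observed, so avoid it
        nxt, reward, done = model[(s, a)]
        if done or d == 1:
            return reward
        return reward + gamma * max(q(nxt, b, d - 1) for b in ACTIONS)
    return max(ACTIONS, key=lambda a: q(state, a, depth))

if __name__ == "__main__":
    model = fit_forward_model(collect_rollouts())
    print("planned first action from cell 0:", plan(model, 0))
```

The planner never calls `env_step`; it only queries the learned table, which is the sense in which an agent can improve inside its own model of the world. In the pixel-based setting the table would be replaced by a learned latent representation and a neural forward model.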
5 Procedural content generation
In addition to playing games, another active area of AI research is procedural content generation (PCG) PCGbook; risi2019procedural. PCG refers to the algorithmic creation of game content such as levels, textures, quests, characters, or even the rules of the game itself.
One of the appeals of employing PCG in games is that it can increase their replayability by offering the player a new experience every time they play. For example, games such as No Man’s Sky (Hello Games, 2016) or Spelunky (Mossmouth, LLC, 2013) famously featured PCG as part of their core gameplay, allowing players to explore an almost unlimited variety of planets or caves. One of the most important early benefits of PCG methods was that they allowed the creation of larger game worlds than what would normally fit on a computer’s storage media at the time. One of the first games using PCG-based methods was Elite (Acornsoft, 1984), a space trading video game featuring thousands of planets. The whole star system, including each visited planet and its space stations, could be recreated from a given random seed.
While the origin of PCG is rooted in creating a more engaging experience for players yannakakis2011experience, more recently PCG-based approaches have also found other important use cases. With the realisation that methods such as deep reinforcement learning can surpass humans in many games also came the realisation that these methods overfit to the exact environment they are trained on justesen2018illuminating; zhang2018study. For example, an agent trained to reach the level of a human expert in a game such as Breakout will fail completely when tested on a Breakout version where the paddle has a slightly different size or is at a slightly different position. Recent research showed that training agents on many procedurally generated levels allows them to become significantly more general justesen2018illuminating. In an impressive extension of this idea, DeepMind trained agents on a large number of randomly created levels to reach human-level performance in the Quake III Capture the Flag game jaderberg2019human. This trend to make AI approaches more general by training them on endless variations of environments was continued in the hide-and-seek work by OpenAI Baker2019 and also in the obstacle tower challenge (OTC) Juliani2019 and will certainly also be employed in many future approaches.
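The idea of recreating a world from a seed instead of storing it can be illustrated with a small sketch. This is not the algorithm used in Elite; the functions `generate_planet` and `generate_galaxy`, the syllable table, and the planet attributes are hypothetical and only show that a seeded pseudo-random generator yields the same content on every run.

```python
import random

# Hypothetical building blocks for planet names and attributes.
SYLLABLES = ["la", "ve", "ti", "on", "ra", "xe", "us", "di", "so", "qu"]
ECONOMIES = ["agricultural", "industrial", "mining", "high-tech"]

def generate_planet(galaxy_seed, index):
    """Derive one planet deterministically from the galaxy seed and its index."""
    rng = random.Random(galaxy_seed * 1_000_003 + index)
    name = "".join(rng.choice(SYLLABLES) for _ in range(rng.randint(2, 4)))
    return {
        "name": name.capitalize(),
        "economy": rng.choice(ECONOMIES),
        "tech_level": rng.randint(1, 12),
        "position": (rng.randrange(256), rng.randrange(256)),
    }

def generate_galaxy(galaxy_seed, n_planets):
    """Nothing is stored: every planet is recomputed on demand from the seed."""
    return [generate_planet(galaxy_seed, i) for i in range(n_planets)]

if __name__ == "__main__":
    for planet in generate_galaxy(42, 3):
        print(planet)
```

Only the seed needs to be kept in memory or in a save file; any planet can be regenerated individually when the player visits it.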
Meanwhile, PCG has been applied to many different types of game components or facets (e.g. visuals, sound), but most often to only one of these at once. One of the open research questions in this context is how generators for different facets can be combined Liapis2019.
Similar to some of the other techniques described in this article, PCG has more recently also been found to be applicable to areas outside of games risi2019procedural. For example, training a humanoid robot hand to manipulate a Rubik’s cube in a simulator on many variants of the same problem (e.g. varying parameters such as the size, mass, and texture of the cube) has allowed a policy trained in a simulator to sometimes work on a physical robot hand in the real world. For a review of how PCG has increased generality in machine learning we refer the interested reader to the survey risi2019procedural and for a more in-depth review of PCG in general to the book by Shaker et al. PCGbook.
6 Merging state and pixel information
Whereas the AI in AlphaGo and its predecessors for playing board games dealt with board positions and possible moves, deep RL and recent evolutionary approaches for optimising deep neural networks (a research field now referred to as deep neuroevolution stanley2019designing), learn to play Atari games directly from pixel information. On the one hand, these approaches have some conceptual simplicity, but on the other hand, it is intuitively clear that adding more information – if available – may be of advantage. More recently, these two ways of obtaining game information were joined in different ways.
The hide-and-seek approach Baker2019 depends on visual and state information of the agents but also heavily on the use of co-evolutionary effects in a multi-agent environment that are strongly reminiscent of EA techniques.
In AlphaStar (Fig. 3), which was designed to play StarCraft at human professional level, both state information (location and status of units and buildings) and pixel information (minimap) are fed into the algorithm. Interestingly, self-play is used heavily, but is not sufficient to generate players that are competitive with human professionals because the strategy space is huge and human opponents may come up with very different ways to play the game that must all be handled. Therefore, as in AlphaGo, human game data is used to seed the algorithm. Furthermore, co-evolutionary effects in a three-tier league of different types of agents drive the learning process. It should be noted that the success of AlphaStar was hard to imagine only some years ago because RTS games were considered the hardest possible testbeds for AI algorithms in games Ontanon2013. These successes are, however, not without controversy, and people argue whether the comparisons of AIs playing against humans are fair justesen2019we; canaan2019leveling.
MuZero schrittwieser2019mastering is able to learn to play Atari games (pixel input) as well as Chess and Go (state input) by generating virtual states according to reward/position value similarity. These are managed in a tree-like fashion as in MCTS but costly rollouts are avoided. The elegance of this approach lies in the ability to use different types of input and the construction of an internal representation that is oriented only at values and not at exact game states.
7 Towards more general AI
While AI algorithms have become exceedingly good at playing specific games justesen19review, it is still an unsolved challenge how to make an AI algorithm that can quickly learn to play any game it is given, or how to transfer skills learned in one game to another. This challenge, also known as General Video Game Playing genesereth2005general, has resulted in the development of the General Video Game AI framework (GVGAI), a flexible framework designed to facilitate the development of general AI through video game playing perez2016general.
With increasingly complicated worlds and graphics, video games might be the ideal environment to learn more general intelligence. Another benefit of games is that they often share similar controllers and goals. To spur developments in this area, the GVGAI framework now also includes a Learning Track, in which the goal of the agent is to learn a new game quickly without being trained on it beforehand. The hope is that methods that can quickly learn any game they are given will also ultimately be able to quickly learn other tasks such as robot manipulation in the real world.
Whereas most successful approaches for GVGAI games employ MCTS, it should be noted that there are also other competitive approaches such as the rolling horizon evolutionary algorithm (RHEA) PerezSLR13, which evolves partial action sequences as a whole through an evolutionary optimization process. Furthermore, DL variants are starting to be used here as well torrado2018.
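The rolling horizon idea can be sketched as follows: at every game tick, a small population of fixed-length action sequences is evolved against a forward model, the first action of the best sequence is executed, and the process restarts from the new state. The one-dimensional toy game, the names `forward`, `score`, `rhea_action` and `play`, and all parameter values are assumptions for this illustration and are not taken from the GVGAI framework.

```python
import random

ACTIONS = (-1, 0, 1)  # move left, stay, move right
TARGET = 7            # hypothetical goal position on a line

def forward(pos, action):
    """Forward model of the toy game: the agent moves along a line."""
    return pos + action

def score(pos, plan):
    """Simulate an action sequence and sum the negative distances to the target."""
    total = 0
    for action in plan:
        pos = forward(pos, action)
        total -= abs(TARGET - pos)
    return total

def rhea_action(pos, rng, horizon=5, pop_size=10, generations=10, mut_rate=0.3):
    """Evolve whole action sequences and return the first action of the best one."""
    population = [[rng.choice(ACTIONS) for _ in range(horizon)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda p: score(pos, p), reverse=True)
        elite = population[: pop_size // 2]
        # Children are mutated copies of randomly chosen elite sequences.
        children = [
            [rng.choice(ACTIONS) if rng.random() < mut_rate else a for a in rng.choice(elite)]
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    best = max(population, key=lambda p: score(pos, p))
    return best[0]

def play(steps=30, seed=0):
    """Re-plan at every tick: only the first action of each evolved plan is executed."""
    rng = random.Random(seed)
    pos = 0
    for _ in range(steps):
        pos = forward(pos, rhea_action(pos, rng))
    return pos

if __name__ == "__main__":
    print("final position:", play(), "target:", TARGET)
```

Unlike MCTS, no tree is built; the search operates on complete sequences, which is what the text means by evolving partial action sequences as a whole.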
8 AI for player modelling and other roles
In this section, we briefly mention a few other use cases for current AI methods. In addition to learning to play or generating games and game content, another important aspect of Game AI – and potentially currently the main use case in the game industry – is game analytics. Game analytics has changed the game landscape dramatically over the last ten years. The main idea in game analytics is to collect data about the players while they play the game and then update the game on the fly. For example, the difficulty of levels can be adjusted or the user interface can be streamlined. The point at which players stop playing the game can be an important indication of what to change to reduce the game’s churn.
Another important application area of Game AI is player modelling. As the name suggests, player modelling aims to model the experience or behavior of the player bakkes2012player; yannakakis2013player. One of the main motivations for learning to model players is that a good player model can allow the game to be tailored even more to the individual player. A variety of different approaches to model players exist, ranging from supervised learning (e.g. training a neural network in a supervised way on recorded plays of human players to behave the same way) to unsupervised approaches such as clustering that aim to group similar players together drachen2009player. Based on which cluster a new player belongs to, different content or other game adaptations can be provided. Combining PCG (Sect. 5) with player modelling, an approach called Experience-Driven Procedural Content Generation yannakakis2011experience, allows these algorithms to automatically generate unique content that induces a desired experience for a player. For example, pedersen2010modeling trained a model on players of Super Mario, which could then be used to automatically generate new Mario levels that maximise the modelled fun value for a particular player. Exciting recent work can even predict a player’s affect in certain situations from pixels alone makantasis2019pixels.
There is also a large body of research on human-like non-player characters (NPC) hingston2012. Some years ago, this research area was at the core of the field, and with the upcoming interest in human/AI collaboration it is likely to thrive again in the next years.
Other roles for Game AI include playtesting and balancing, which both belong to game production and mostly happen before games are published. Testing for bugs or exploits in a game is an interesting application area of huge economic potential and some encouraging results exist denzinger2005dealing. With the rise of machine learning methods that can play games at a human or beyond human level and methods that can solve hard-exploration games such as Montezuma’s Revenge ecoffet2019go, this area should see a large increase of interest from the game industry in the coming years. Mixed-initiative tools that allow humans to create game content together with a computational creator often include an element of automated balancing, such as balancing the resources on a map in a strategy game liapis2013sentient. Game balancing is a wide and currently under-researched area that may be understood as a multi-instance parameter tuning problem. One of the difficulties here is that many computer games do not support headless accelerated play or offer APIs for controlling it. Some automated approaches exist for single games PreussPVP18 but they usually cannot cope with the full game, and approaches for solving this problem more generally are not well established yet Volz2019. Dynamic re-balancing during game runtime is usually called dynamic difficulty adaptation (DDA) spronck2006adaptive.
9 Journals, conferences, and competitions
The research area of Game AI is centered in computer science, but influenced by other disciplines such as psychology, especially when it comes to handling humans and their emotions yannakakis2014emotion; yannakakis2018ordinal. Furthermore, (computational) art and creativity (for PCG), game studies (formal models of play) and game design are important neighboring disciplines.
In computer science, Game AI is not only limited to machine learning and traditional branches of AI but also has links to information systems, optimization, computer vision, robotics, simulation, etc. Some of the core conferences for Game AI are:
Foundations of Digital Games (FDG)
IEEE Conference on Games (CoG), until 2018 the Conference on Computational Intelligence and Games (CIG)
Artificial Intelligence for Interactive Digital Entertainment (AIIDE)
Also, many computer science conferences have tracks or co-located smaller conferences on Game AI, e.g. GECCO and IJCAI.
The more important journals in the field are the IEEE Transactions on Games ToG (formerly TCIAIG) and the IEEE Transactions on Affective Computing. The most active institutes in the area can be taken from a list (incomplete, focused only on the most relevant venues) compiled by Mark Nelson.
A large part of the progress of the last years is due to the free availability of competition environments such as StarCraft, GVGAI, Angry Birds, Hearthstone, Hanabi, MicroRTS, Fighting Game, Geometry Friends and more, and also of more general frameworks such as ALE, GGP, OpenSpiel, OpenAIGym, SC2LE, MuJoCo, and DeepRTS.
10 The future of Game AI
More advanced AI techniques are slowly finding their way into the game industry and this will likely increase significantly over the coming years. Additionally, companies are increasingly collaborating with research institutions to bring the latest innovations out to the industry. For example, Massive Entertainment and the University of Malta collaborated to predict the motivations of players in the popular game Tom Clancy’s The Division melhart2019your. Other companies, such as King, are investing heavily in deep learning methods to automatically learn models of players that can then be used for playtesting new levels quickly gudmundsson2018human.
Procedural content generation is already employed for many mainstream games such as Spelunky (Mossmouth, LLC, 2013) and No Man’s Sky (Hello Games, 2016) and we will likely see completely new types of games in the future that would be impossible to realise without sophisticated AI techniques. The recent AI Dungeon 2 game (www.aidungeon.io) points to the direction these games might take. In this text adventure game players can interact with OpenAI’s GPT-2 language model, which was trained on 40 gigabytes of text scraped from the internet. The game responds to almost anything the player types in a sensible way, although the generated stories also often lose coherence after a while. This observation points to an important challenge: For more advanced AI techniques to be more broadly employable in the game industry, approaches are needed that are more controllable and potentially interpretable by designers zhu2018explainable.
We predict that in the near future, generative modelling techniques from machine learning, such as Generative Adversarial Networks (GANs) goodfellow2014generative, will allow users to personalise their avatars to an unprecedented level or allow the creation of an unlimited variety of realistic textures and assets in games. This idea of Procedural Content Generation via Machine Learning (PCGML) summerville2018procedural is an emerging research area that has already led to promising results in generating levels for games such as Doom giacomello2018doom or Super Mario volz2018evolving.
From the current perspective, we would expect that future research (next to playing better on more games) in Game AI will focus on these areas:
AI/human collaboration and AI/AI agent collaboration are getting more important; this may be subsumed under the term team AI. Recent attempts in this direction include OpenAI Five raiman2019long, Hanabi Bard2019, and capture the flag jaderberg2019human.
More natural language processing enables better interfaces and at some point free-form direct communication with game characters. Already existing commercial voice-driven assistance systems such as the Google Assistant or Alexa show that this is possible.
The previous points and the progress in player modeling and game analysis will lead to more human-like AI behavior; this will in turn enable better playtesting that can be partly automated.
PCG will be applied more in the game industry and other applications. For example, it is used heavily in Microsoft’s new flight simulator version that is now (January 2020) in alpha test mode. This will also trigger more research in this area.
Nevertheless, as in other areas of artificial intelligence, Game AI will have to cope with some issues that mostly stem from two newer developments: theory-light but very successful deep learning methods, and highly parallel computation. The first entails that we have very little control over the performance of deep learning methods, as it is hard to predict what works well with which parameters. The second means that many experiments can hardly ever be replicated due to hardware limitations. For example, OpenAI Five was trained on 256 GPUs and 128,000 CPUs OpenAI_dota for a long time. More generally, large parts of the deep learning driven AI are currently presumed to run into a
It is definitely desirable to also apply the algorithms that successfully deal with complex games to other application areas. Unfortunately, this is usually not trivial, but some promising examples already exist. The AlphaGo approach, which is based on searching by means of MCTS in a neural network representation of the treated problem, has been transferred to the chemical retrosynthesis problem segler2018planning, which consists of finding a synthesis path for a specific chemical compound as depicted in Fig. 4. Because in the synthesis problem, in contrast to playing Go, the set of feasible moves (possible reactions) is not given but has to be learned from data, the approach bears some similarity to MuZero schrittwieser2019mastering. The idea to learn a forward model from data has been termed world program segler2019world.
Similarly, the same distributed RL system that OpenAI used to train a team of five agents for Dota 2 berner2019dota was used to train a robot hand to perform dexterous in-hand manipulation andrychowicz2020learning.
We believe Game AI research will continue to drive innovations in the world of AI and hope this review article will serve as a useful guide for researchers entering this exciting research field.
We would like to thank Mads Lassen, Rasmus Berg Palm, Niels Justesen, Georgios Yannakakis, Marwin Segler, and Christian Igel for comments on earlier drafts of this manuscript.
- see http://www.dagstuhl.de/12191, http://www.dagstuhl.de/15051, http://www.dagstuhl.de/17471, http://www.dagstuhl.de/19511
- We are aware that this division is a bit simplistic, of course players can be also modeled online or for supporting the design phase. Please consider this a rough guideline only.
- In the game context, churn means that a player who has played a game for some time completely stops playing it. This is usually very hard to predict but essential to know especially for online game companies.