TLoL: Human level in League of Legends using Deep Learning (Grok)
Table of Contents
- Introduction
- Feasibility?
- Existing LLM Game Playing Systems
- Existing Work
- Proposal
Introduction
Recently there has been a massive development for League of Legends game-playing AI…
Let’s see if @Grok 5 can beat the best human team @LeagueOfLegends in 2026 with these important constraints:
1. Can only look at the monitor with a camera, seeing no more than what a person with 20/20 vision would see.
2. Reaction latency and click rate no faster than human.…
— Elon Musk (@elonmusk) November 25, 2025
Elon Musk has issued a public challenge to beat the world's best League of Legends team by the end of 2026 using Grok 5. There are two very interesting restrictions attached to the challenge:
- Can only look at the monitor with a camera, seeing no more than what a person with 20/20 vision would see.
- Reaction latency and click rate no faster than human.
The second criterion exists to avoid the bot simply out-"APMing" humans, in the sense that a bot with higher mechanical precision and reaction speed can mathematically calculate the perfect way of avoiding damage. Most would agree this isn't a display of interesting behaviour, but rather a "hydraulic press" way of beating humans.
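As an illustration, a harness enforcing this restriction might gate every action through a reaction-latency delay and a click-rate cap. This is a minimal sketch with assumed limits and function names; nothing here comes from the challenge spec:

```python
import time
from collections import deque

# Illustrative human-like limits (assumed values, not from the challenge spec):
REACTION_LATENCY_S = 0.2   # roughly median human visual reaction time
MAX_CLICKS_PER_S = 8       # rough ceiling on sustained human click rate

class HumanRateGate:
    """Delays and throttles actions so the agent cannot out-APM a human."""

    def __init__(self):
        self.recent_clicks = deque()  # timestamps of recently issued actions

    def issue(self, action, decided_at):
        # 1. Enforce reaction latency: wait until REACTION_LATENCY_S has
        #    passed since the observation that triggered this decision.
        wait = decided_at + REACTION_LATENCY_S - time.monotonic()
        if wait > 0:
            time.sleep(wait)

        # 2. Enforce the click-rate cap over a sliding one-second window.
        now = time.monotonic()
        while self.recent_clicks and now - self.recent_clicks[0] > 1.0:
            self.recent_clicks.popleft()
        if len(self.recent_clicks) >= MAX_CLICKS_PER_S:
            time.sleep(1.0 - (now - self.recent_clicks[0]))  # oldest click ages out
            self.recent_clicks.popleft()

        self.recent_clicks.append(time.monotonic())
        action()  # e.g. a mouse click issued via an OS-level input library
```

The exact numbers would have to be agreed for the challenge; the point is that the restriction is mechanically enforceable outside the model itself.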
However, the first criterion is more interesting: the input into the bot is just the raw visual feed from the game. This departs from other frontier game-playing systems such as OpenAI Five and AlphaStar, which were provided raw game state through an API. It means Grok 5 will also have to learn to process visual information, segment the different objects in view, and master many other complex visual processing skills. Presumably Grok will also be subject to the same viewport restrictions as humans, so the bot would have to pan the camera over to areas of interest as well.
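To make that concrete, a vision-only observation pipeline has to start from pixels alone. The sketch below captures a screen region and downsamples it as a crude stand-in for what a camera pointed at a monitor would deliver; the resolutions and library choice are my assumptions:

```python
import numpy as np
from PIL import ImageGrab  # screen capture from the Pillow library

# Assumed numbers for illustration: a 1080p monitor viewed by a camera,
# limited to roughly what 20/20 vision resolves at a normal viewing distance.
CAPTURE_REGION = (0, 0, 1920, 1080)
EFFECTIVE_RESOLUTION = (960, 540)  # crude stand-in for camera/acuity limits

def observe() -> np.ndarray:
    """Return one vision-only observation: an H x W x 3 uint8 frame."""
    frame = ImageGrab.grab(bbox=CAPTURE_REGION)
    frame = frame.resize(EFFECTIVE_RESOLUTION)  # discard sub-pixel detail
    return np.asarray(frame, dtype=np.uint8)

# The agent sees only this array: minimap reading, health-bar estimation
# and unit segmentation all have to be learned from these pixels.
```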
In the figure above we can see the architecture of OpenAI Five, the system which beat the world champions at Dota 2 in 2019. The section in blue is dedicated purely to processing each of the individual units in the game, integrating complex spatial-visual information along with per-unit semantic information. Grok 5 will have to learn to extract similar information for League of Legends directly from pixels.
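For a flavour of what that blue section computes, here is a heavily simplified per-unit encoder in the spirit of OpenAI Five's unit processing. The dimensions and pooling choice are placeholders, not the published hyperparameters:

```python
import torch
import torch.nn as nn

class UnitSetEncoder(nn.Module):
    """Embeds a variable-size set of units and pools them to a fixed vector.

    Loosely in the spirit of OpenAI Five's per-unit processing; the
    dimensions here are placeholders, not the published hyperparameters.
    """

    def __init__(self, unit_feat_dim: int = 64, embed_dim: int = 256):
        super().__init__()
        self.unit_mlp = nn.Sequential(
            nn.Linear(unit_feat_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, units: torch.Tensor) -> torch.Tensor:
        # units: (batch, num_units, unit_feat_dim) - one row per visible
        # unit: position, health, team, unit type features, and so on.
        per_unit = self.unit_mlp(units)   # (batch, num_units, embed_dim)
        pooled, _ = per_unit.max(dim=1)   # permutation-invariant pooling
        return pooled                     # (batch, embed_dim)

# An API-based system is handed this unit table directly; a vision-only
# agent like the proposed Grok 5 must first reconstruct it from pixels.
```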
Feasibility?
The key question for this project is: within the next year, is it feasible that a visual large language model (vLLM) such as Grok 5 could genuinely beat the best League of Legends team in the world? Let's assess this by asking what beating the world's best team actually means. Grok 5 will presumably have to control five agents within League of Legends, one for each role (Top, Jungle, Mid, ADC and Support), with the aforementioned restrictions. Aside from that, the system has no other restrictions.
So, has this type of thing been done before? Yes: AI systems have beaten professional teams in multiple MOBA-style games, including Dota 2 (OpenAI Five) and Honor of Kings, a game very similar to League of Legends (JueWu-SL).
However, the nature of Grok 5 being the agent, along with the purely visual input, changes things drastically. Some contributors on X (formerly Twitter) have already built environments using Claude Opus 4.5 (currently SOTA for LLM-based agentic control systems) to play League of Legends.
Can computer-use models play games now, one-shot?
I gave Claude Opus 4.5 a simple prompt like "play league of legends" and it starts clicking and typing around my computer pretty effectively even though it doesn't win due to latency
More interestingly between Minecraft, finding… pic.twitter.com/jSXf6E6a95
— Surya Dantuluri (@sdand) November 26, 2025
From this we can already see some interesting strengths and weaknesses of this SOTA (state-of-the-art) system:
- Mechanical precision: there is a clear lack of mechanical precision in the movement orders issued by Claude
- Object segmentation: the system is struggling to precisely identify different objects in the view
- Speed: the main limitation is the lack of a real-time control loop, due both to time-to-first-token and to the time Claude takes to generate the text responses that encode its reasoning and final action prediction (see the sketch after this list)
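A common workaround for this latency is to decouple a slow deliberation loop from a fast execution loop: the LLM plans asynchronously while a lightweight controller keeps acting on the last available plan. Everything named below, `llm_plan` in particular, is a hypothetical stand-in rather than an existing API:

```python
import asyncio

async def llm_plan(observation):
    """Hypothetical stand-in for a slow LLM call that returns a short plan."""
    await asyncio.sleep(2.0)  # simulate seconds of model latency
    return ["move_to_lane", "attack_nearest_minion"]

async def agent_loop(get_observation, execute_action, ticks: int = 100):
    plan = ["idle"]
    pending = asyncio.ensure_future(llm_plan(get_observation()))

    for _ in range(ticks):
        # Fast loop: act every ~100 ms on the latest available plan,
        # while the slow LLM deliberation runs in the background.
        if pending.done():
            plan = pending.result()
            pending = asyncio.ensure_future(llm_plan(get_observation()))
        execute_action(plan[0])
        await asyncio.sleep(0.1)

# Usage: asyncio.run(agent_loop(obs_fn, act_fn)) with your own I/O callbacks.
```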
However there are some impressive trends:
- Learning: the agent learns from its experience in real time, adjusting its behaviour based on in-game feedback
- Generalisation: despite never having been trained to play League of Legends, it somewhat successfully controls the player and issues commands
Existing LLM Game Playing Systems
This follows a growing body of work aimed at using LLMs to play games as a test of their world knowledge, ongoing learning abilities and general curiosity, such as:
- Gemini plays Pokemon - Gemini 3 Pro playing Pokemon Crystal
- Claude plays Pokemon - Claude Opus 4.5 playing Pokemon Red
- Minecraft Voyager - LLM-based agent which learns to play Minecraft by iteratively adapting Python policies based on game feedback
The common theme among these projects is:
- Vision: the Pokemon-related projects feed purely visual inputs into Gemini and Claude
- Motor control: the systems expose a simpler action paradigm to the agents, which is essentially tool use. Minecraft Voyager likewise starts from a simple set of basic actions, but its agent learns to flexibly compose these into hierarchies of complex skills to accomplish in-game achievements such as acquiring diamonds (see the sketch after this list)
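To illustrate the Voyager-style pattern, here is a toy skill library in which composite skills are built from primitives. The skill names and structure are invented for illustration; Voyager actually generates JavaScript policies against the Mineflayer API:

```python
# Toy Voyager-style skill library: primitives at the bottom, learned
# composite skills layered on top. All names here are invented.
SKILLS = {}

def register(name):
    def wrap(fn):
        SKILLS[name] = fn
        return fn
    return wrap

def move_to(target): print(f"moving to {target}")
def mine(block): print(f"mining {block}")
def craft(item): print(f"crafting {item}")

@register("get_wood")
def get_wood():
    move_to("tree"); mine("log")

@register("make_pickaxe")
def make_pickaxe():
    SKILLS["get_wood"]()       # composite skills reuse earlier skills
    craft("wooden_pickaxe")

@register("get_diamonds")
def get_diamonds():
    SKILLS["make_pickaxe"]()   # the hierarchy deepens as the agent learns
    move_to("deep cave"); mine("diamond ore")

get_diamonds()
```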
So between the existing LLM agentic control projects (Pokemon, Voyager, etc.), the existing Opus 4.5 League of Legends agent and the history of successful MOBA game-playing AI systems, it stands to reason that producing a bot which can play League of Legends is entirely feasible. However, this raises the next question: can Grok 5 learn to beat the world champions at League of Legends, and what would be required to achieve this?
Existing Work
To answer this, we need to understand what has and hasn't worked in prior League of Legends game-playing AI research, and why those limitations matter for the Grok 5 challenge. For a comprehensive overview of other existing projects, refer to TLoL - Part 1.
The first place to look is existing work on League of Legends game-playing AI. My previous work in this domain centred on a project called TLoL, an attempt at producing human-level League of Legends AI. Ultimately that project concluded prematurely due to the sheer difficulty of the task and the nature of its execution.
League of Legends (Patch v4.20)
- pylol - League of Legends patch 4.20 reinforcement learning environment
- pylol-demo - First open-source League of Legends RL environment, successfully trains a basic agent using PPO in Google Colab! (see the sketch after this list)
- lolgym - Adversarial multi-agent system trained to outperform a hardcoded bot and humans in a 1v1 environment (from Mustafa Eyceoz and Justin Lee at Columbia University)
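For a sense of what training against such an environment looks like, here is a schematic PPO setup. The environment below is a generic gym-style placeholder with invented observation and action spaces; pylol's real interface is modelled on PySC2 and differs from this:

```python
# Schematic of training a basic agent with PPO, in the style of the
# pylol-demo Colab. "ToyLoLEnv" is a generic gym-style placeholder with
# invented spaces; pylol's real interface is modelled on PySC2 and differs.
import gymnasium as gym
import numpy as np
from stable_baselines3 import PPO

class ToyLoLEnv(gym.Env):
    def __init__(self):
        self.observation_space = gym.spaces.Box(-1.0, 1.0, shape=(32,), dtype=np.float32)
        self.action_space = gym.spaces.Discrete(5)  # 4 move directions + attack

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        return self.observation_space.sample(), {}

    def step(self, action):
        obs = self.observation_space.sample()
        reward = 0.0  # e.g. damage dealt minus damage taken per tick
        terminated, truncated = False, False
        return obs, reward, terminated, truncated, {}

model = PPO("MlpPolicy", ToyLoLEnv(), verbose=0)
model.learn(total_timesteps=10_000)  # trains against the placeholder env
```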
League of Legends (Season 11 - 13)
- TLoL RL - League of Legends Season 13 reinforcement learning environment
- tlol-py - Replay scraping orchestration, conversion of extracted replay data into datasets for analysis and model training, etc.
- TLoL - League of Legends Season 11 to 13 game-playing AI datasets (10k+ games; see the sketch after this list)
- tlol-llm - Very early attempts at analysing League of Legends games using LLMs to test understanding
- League Data Scraping - Packet-based approach at producing League of Legends training datasets at scale (in contrast to tlol-py, from Henry Zhu at University of Illinois Urbana-Champaign)
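Datasets like these are typically consumed as (state, action) pairs for behavioural cloning. The sketch below assumes a simple tabular per-frame schema; the actual TLoL storage format, table and column names may well differ:

```python
# Sketch of turning replay-derived data into behavioural-cloning pairs.
# The table name, columns and storage format are assumptions made for
# illustration; the actual TLoL dataset schema may differ.
import sqlite3
import numpy as np

def load_training_pairs(db_path: str):
    """Yield (state, action) pairs from a per-game replay database."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT x, y, hp, mana, action_id FROM frames ORDER BY game_time"
        )
        for x, y, hp, mana, action_id in rows:
            state = np.array([x, y, hp, mana], dtype=np.float32)
            yield state, action_id
    finally:
        conn.close()
```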
However, each strand of research here has particular issues. The patch 4.20 research benefits from an offline simulator of the game, which allows hyper-scaling of training, but the simulator is very limited: only one champion, Ezreal, is implemented properly, it lacks any support for minions, it can be buggy in general, etc.
As for the TLoL set of tools, they matched the actual live game patches but suffered from League of Legends' active attempts to encrypt and obfuscate the game. The game's memory offsets (which the underlying scripting engines depended on) had to be re-determined on every patch to keep the replay analysis tools and the RL environment working. This led to an Enigma-style effort of constantly updating the offsets, akin to the Allies having to re-crack the Enigma cipher every day to intercept German military communications. Since the release of Vanguard (Riot Games' kernel-level anti-cheat tool), developing scripting-based systems for League of Legends has become harder still.
This means that any serious attempt to train Grok 5 to conquer League of Legends would be far easier to accomplish with the direct aid of Riot Games.
Proposal
For scaling, it would be most useful if Riot Games either provided an offline version of the League of Legends game server, or at least an API to game instances running on Riot's end along with an API to control agents within those games. This would work similarly to the Bot Scripting API for Dota 2 from Steam; alternatively, an offline version exposed via a convenient API, like PySC2 for StarCraft II, would also suffice.
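To make the proposal concrete, such an interface might look something like the sketch below. Every class and method here is hypothetical; it simply mirrors the shape of PySC2's environment API as applied to League of Legends:

```python
# Hypothetical Riot-provided agent API, mirroring the shape of PySC2.
# None of these classes, methods or endpoints exist today.
from dataclasses import dataclass

@dataclass
class Observation:
    frame: bytes        # rendered viewport pixels, per the challenge rules
    game_time: float

@dataclass
class Action:
    kind: str           # "move", "attack", "cast", ...
    x: float
    y: float

class LoLEnv:
    """Five agent slots, one per role, stepped in lockstep with the server."""

    def __init__(self, server: str,
                 roles=("top", "jungle", "mid", "adc", "support")):
        self.server, self.roles = server, roles

    def reset(self) -> list["Observation"]:
        """Start a new game and return one observation per role."""
        ...

    def step(self, actions: list["Action"]) -> list["Observation"]:
        """Submit one action per role; the server advances one tick."""
        ...
```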