
Meta Unveils Code World Model: AI That Generates and Understands Code

The new AI model from Meta can generate code from descriptions and analyze algorithm complexity. Its strong benchmark results hint at its potential in code generation and understanding.



Meta has unveiled the Code World Model (CWM), a new AI model designed to generate code and understand how it executes. The model, developed by Meta's FAIR research team, has posted strong results on a range of benchmarks.

CWM was trained in three phases. First, it learned basic programming concepts. Next, it observed more than 120 million Python program executions, tracking how variables and program state change along the way. Finally, it refined its skills through reinforcement learning on complex tasks.
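To give a sense of what "observing an execution" can mean, here is a minimal sketch that records a snapshot of local variables before each line of a small function runs, using Python's standard `sys.settrace` hook. The helper `trace_locals`, the sample function `running_sum`, and the `(line offset, locals)` record format are illustrative assumptions, not the trace format Meta used.

```python
import sys

def trace_locals(func, *args):
    """Run func(*args) and record (line offset, locals snapshot) before each line executes."""
    steps = []
    code = func.__code__

    def tracer(frame, event, arg):
        # Only follow the function under observation; ignore every other frame.
        if frame.f_code is not code:
            return None
        if event == "line":
            offset = frame.f_lineno - code.co_firstlineno
            steps.append((offset, dict(frame.f_locals)))
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # always remove the hook, even if func raises
    return result, steps

# Hypothetical program to observe.
def running_sum(n):
    total = 0
    for i in range(n):
        total += i
    return total

result, steps = trace_locals(running_sum, 3)
for offset, snapshot in steps:
    print(offset, snapshot)
```

Each printed row pairs a line of the function with the variable values in scope just before that line runs, which is roughly the kind of step-by-step state information the article describes.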

The benchmark results reflect these capabilities. On HaltEval, CWM reached 94% accuracy in simulating program behavior. In reasoning mode, it scored 94.3% on CruxEval Output. It also posted 68.6% on LiveCodeBench, 96.6% on Math-500, and 76.0% on the AIME math competition.

CWM can also work backwards, deriving code from a program description and its expected results. It can analyze algorithms and predict their time complexity, ranking second on the BigOBench leaderboard. On SWE-bench Verified, it outperformed many smaller open-source models but was surpassed by Qwen3-Coder.
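For readers unfamiliar with time-complexity labels, the sketch below shows one simple way to check such a label empirically: count the basic operations a routine performs at input sizes n and 2n, then take the base-2 logarithm of the ratio to estimate the exponent k in n^k. This is a measurement technique chosen for illustration, not how CWM arrives at its predictions, and the function names are hypothetical.

```python
import math

def count_pair_comparisons(n):
    """Count inner-loop iterations of a naive all-pairs scan (expected: quadratic)."""
    ops = 0
    for i in range(n):
        for j in range(i + 1, n):
            ops += 1
    return ops

def count_linear_scan(n):
    """Count iterations of a single pass over n items (expected: linear)."""
    ops = 0
    for _ in range(n):
        ops += 1
    return ops

def growth_exponent(counter, n=256):
    """Estimate k in ops ~ n**k by doubling the input size."""
    return math.log2(counter(2 * n) / counter(n))

pairs_k = growth_exponent(count_pair_comparisons)
linear_k = growth_exponent(count_linear_scan)
print(f"all-pairs scan exponent: {pairs_k:.2f}")
print(f"linear scan exponent: {linear_k:.2f}")
```

An estimate near 2 is consistent with O(n^2) and an estimate near 1 with O(n). A model that predicts complexity from source code is instead expected to reach the same label without running anything.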

Meta has published CWM on Hugging Face as an open-weights model under a non-commercial research license. Although other models beat it on some benchmarks, its strong results across a wide range of tests, together with its ability to analyze algorithm complexity and work backwards from program descriptions, demonstrate its potential in code generation and understanding.
