MiniMax M2.7: The $0.30 Chinese Model That Evolves Itself
Shanghai-based MiniMax released M2.7 on 18 March 2026, a 10-billion-parameter language model that does something no frontier model has done before: it rewrites its own code. Over 100 autonomous iteration cycles, M2.7 handled between 30% and 50% of its own reinforcement learning workflow, from diagnosing failures to modifying its scaffold architecture to running evaluations and deciding whether to keep or revert changes.
The result is a model that matches OpenAI's GPT-5.3-Codex on the SWE-Pro benchmark at 56.22%, rivals Anthropic's Claude Opus 4.6 on agent tasks, and costs 50 times less on input tokens. At $0.30 per million input tokens, dropping to an effective $0.06 with cache optimisation, M2.7 is the cheapest frontier-class model on the market by a wide margin.
A Model That Debugs Its Own Training
Self-evolution is the headline feature. Using MiniMax's OpenClaw agent framework, M2.7 ran an iterative loop: analyse failure trajectories, plan changes, modify scaffold code, run evaluations, compare results, then decide to keep or revert. It completed more than 100 of these cycles autonomously, yielding a 30% internal performance gain without human intervention.
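The keep-or-revert loop described above can be illustrated with a minimal sketch. This is not MiniMax's or OpenClaw's implementation, which has not been published in this form; the names `evolve`, `toy_evaluate` and `toy_propose`, the single `temperature` setting and the target value are all hypothetical stand-ins chosen only to show the control flow of propose, evaluate, compare, then keep or revert.

```python
import random
from typing import Callable, Dict, Tuple

Scaffold = Dict[str, float]


def evolve(
    scaffold: Scaffold,
    evaluate: Callable[[Scaffold], float],
    propose: Callable[[Scaffold], Scaffold],
    cycles: int = 100,
) -> Tuple[Scaffold, float, int]:
    """Keep-or-revert loop: accept a proposed change only if the score improves."""
    best_score = evaluate(scaffold)
    kept = 0
    for _ in range(cycles):
        candidate = propose(scaffold)   # plan and apply a change to a copy
        score = evaluate(candidate)     # run the evaluation on the candidate
        if score > best_score:          # compare against the current best
            scaffold, best_score = candidate, score  # keep the change
            kept += 1
        # otherwise revert: the candidate is discarded, scaffold is unchanged
    return scaffold, best_score, kept


# Hypothetical toy problem: one setting, scored by its distance to a target.
TARGET = 0.3
rng = random.Random(0)  # seeded so repeated runs behave identically


def toy_evaluate(s: Scaffold) -> float:
    return -abs(s["temperature"] - TARGET)


def toy_propose(s: Scaffold) -> Scaffold:
    candidate = dict(s)  # copy, so a rejected change never touches the original
    candidate["temperature"] += rng.uniform(-0.1, 0.1)
    return candidate


start = {"temperature": 1.0}
final, final_score, kept = evolve(start, toy_evaluate, toy_propose, cycles=100)
print(f"kept {kept} of 100 changes, final score {final_score:.4f}")
```

Because a change is kept only when the score strictly improves, the final score can never fall below the starting one. The real system reportedly also analyses failure trajectories before proposing changes, which this sketch replaces with random perturbation.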
This is not fine-tuning in the traditional sense. MiniMax describes M2.7 as a "digital engineer" that deeply participates in its own iteration, building evaluation sets, updating its memory, and improving its own skills. The company says this approach accelerated their shift towards becoming an "AI-native organisation," with the goal of full autonomy in data collection, training, and evaluation.
Benchmarks That Punch Above Its Weight
Despite activating only 10 billion parameters, the fewest in its performance tier, M2.7 posts results that would have been frontier-only territory a year ago. On the MLE-Bench Lite competition, it achieved a 66.6% medal rate, tying Google's Gemini 3.1, and won nine gold medals across 22 machine learning competitions run on a single A30 GPU.
| Benchmark | M2.7 Score | Comparable Model |
|---|---|---|
| SWE-Pro (coding) | 56.22% | GPT-5.3-Codex (matched) |
| SWE Multilingual | 76.5% | Frontier tier |
| GDPval-AA (office tasks) | 1,495 Elo | Highest among open-source models |
| MM Claw (complex skills) | 97% adherence | Top tier globally |
| MLE-Bench Lite | 66.6% medal rate | Gemini 3.1 (tied) |
| Toolathon (agent tools) | 46.3% | Global top tier |
The model runs at 100 tokens per second, roughly three times faster than its nearest competitors. Two variants are available: the standard M2.7 for production workloads and M2.7-highspeed for latency-sensitive applications.
By The Numbers
- $0.30 per million input tokens, making M2.7 approximately 50 times cheaper than Claude Opus 4.6 on input and 60 times cheaper on output (MiniMax)
- 10 billion active parameters, the smallest model in the Tier-1 performance class, yet matching models 10-20 times its size (MiniMax)
- 100+ autonomous iteration cycles completed during self-evolution, with a 30% internal performance gain (VentureBeat)
- 1.87 trillion tokens in weekly call volume, making predecessor M2.5 the most-used large model globally for five consecutive weeks (OpenRouter)
"As AI increasingly interacts with people in moments of emotional vulnerability, we as WHO and its stakeholders must ensure these systems are designed and governed with safety, accountability and human well-being at their core."
- Sameer Pujari, WHO AI Lead, on the broader implications of rapidly advancing AI capabilities
What Self-Evolution Means for the Industry
MiniMax is the second Chinese startup to release a proprietary cutting-edge model in recent months, following z.ai with its GLM-5 Turbo. But M2.7's self-evolution capability sets it apart. Where previous models required human researchers to design training pipelines, M2.7 can recursively build its own evaluation datasets, iterate on its architecture, and improve its skill library.
The implications extend beyond MiniMax's own products. If self-evolving models prove reliable at scale, the cost and timeline of AI development could compress dramatically. A process that currently takes teams of researchers months could, in theory, happen in days. For the Middle East and North Africa's AI ecosystem, where Saudi Arabia has embedded AI into its core economic strategy, this represents a potential acceleration of an already rapid development cycle.
"We are at a critical juncture. The pace of AI adoption in people's daily lives has far outstripped investment in understanding its impact."
- Sameer Pujari, WHO AI Lead
The Cost Gap Widens
Perhaps the most disruptive aspect of M2.7 is its pricing. At $0.30 per million input tokens, it undercuts every major Western frontier model by an order of magnitude. With cache optimisation, the effective cost drops to $0.06 per million tokens, a price point that makes enterprise AI deployment economically viable even for small and medium businesses across the Middle East and North Africa.
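To make the pricing arithmetic concrete, here is a small sketch using only the figures quoted in this article: $0.30 per million input tokens, an effective $0.06 with cache optimisation, and the roughly 50-times input price gap. The monthly token volume is a hypothetical workload chosen for illustration, and the comparison price is merely implied by the 50-times ratio rather than taken from a published rate card.

```python
def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in US dollars for a number of input tokens at a per-million price."""
    return tokens / 1_000_000 * price_per_million


LIST_PRICE = 0.30     # USD per million input tokens, as quoted in the article
CACHED_PRICE = 0.06   # effective USD per million with cache optimisation
INPUT_RATIO = 50      # "approximately 50 times cheaper" on input tokens

# Assumption: price implied by the 50x ratio, not an official figure.
implied_comparison_price = LIST_PRICE * INPUT_RATIO

# Hypothetical workload: two billion input tokens per month.
monthly_tokens = 2_000_000_000

list_cost = input_cost_usd(monthly_tokens, LIST_PRICE)
cached_cost = input_cost_usd(monthly_tokens, CACHED_PRICE)
comparison_cost = input_cost_usd(monthly_tokens, implied_comparison_price)
cache_saving = 1 - CACHED_PRICE / LIST_PRICE  # fraction saved by caching

print(f"M2.7 at list price:      ${list_cost:,.2f}")
print(f"M2.7 with caching:       ${cached_cost:,.2f}")
print(f"Implied comparison cost: ${comparison_cost:,.2f}")
print(f"Saving from caching:     {cache_saving:.0%}")
```

On these assumptions the cached rate is 80% below the list price, and the gap to the implied comparison price widens from 50 times to 250 times. Real bills would also depend on output tokens and on how much of the traffic actually hits the cache.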
This cost advantage builds on the momentum established by M2.5, which led global model usage for five consecutive weeks with 1.87 trillion tokens in weekly call volume on OpenRouter. MiniMax's approach, building smaller but more efficient models that self-optimise, stands in contrast to the brute-force scaling that has defined Western AI development. For companies across Saudi Arabia, the UAE, and the rest of the MENA region looking to deploy AI at scale, the cost equation just shifted decisively.
What makes M2.7 different from other Chinese AI models?
M2.7 is the first Chinese large model to deeply participate in its own iteration. It autonomously runs reinforcement learning workflows, handles 30-50% of its development pipeline, and completed over 100 self-improvement cycles without human input, a capability no other model has demonstrated at this scale.
How does M2.7 compare to ChatGPT and Claude?
M2.7 matches GPT-5.3-Codex on coding benchmarks and approaches Claude Opus 4.6 on agent tasks, while costing approximately 50 times less on input tokens. It runs at 100 tokens per second, roughly three times faster than competitors, though it has fewer parameters at 10 billion active.
Is M2.7 open source?
M2.7 is a proprietary model available through MiniMax's agent platform and open API platforms. While not fully open source, its low pricing makes it broadly accessible. The predecessor M2.5 achieved the highest global usage among all models on OpenRouter.
What does self-evolving AI mean for jobs in the MENA region?
Self-evolving AI could compress development timelines from months to days, reducing the need for large research teams. However, it also lowers the barrier for smaller companies to deploy sophisticated AI, potentially creating new roles in AI orchestration and oversight across the Middle East and North Africa's workforce.
MiniMax M2.7 represents a new chapter in the global AI race, one where the finish line keeps moving because the models are now moving it themselves.
Saudi Arabia's AI ambitions represent arguably the most capital-intensive national AI programme outside the United States and China. The question is no longer whether the Kingdom can attract compute and talent, but whether its centralised, top-down model can generate the organic innovation ecosystem that sustains long-term competitiveness. The next 18 months will be decisive.
Frequently Asked Questions
Q: How is the Middle East positioning itself in the global AI race?
Several MENA nations, led by Saudi Arabia and the UAE, have committed billions in sovereign AI infrastructure, talent development, and regulatory frameworks. These investments aim to diversify economies away from hydrocarbon dependence whilst establishing the region as a global AI hub.
Q: What role does government policy play in MENA's AI development?
Government policy is the primary driver. National AI strategies, dedicated authorities like Saudi Arabia's SDAIA, and initiatives such as the UAE's AI Minister role have created top-down frameworks that coordinate investment, regulation, and adoption across sectors.
Q: Why is Arabic natural language processing particularly challenging?
Arabic NLP faces unique challenges including dialectal variation across 25+ countries, complex morphology with root-pattern word formation, right-to-left script handling, and relatively limited high-quality training data compared to English.