Falcon-H1 Arabic Is Leading the April 2026 Arabic LLM Board, and Jais, ALLaM, and Fanar Are Still Responding
The Arabic LLM race has produced a surprise leader in April 2026, and the benchmarks are not close. TII's Falcon-H1 Arabic family, released in January 2026, tops the Open Arabic LLM Leaderboard (OALL) in every size class it competes in. The 3B model averages 61.87%, the 7B posts 71.47%, and the 34B reaches 75.36%. That is the tidy version. The scrappier truth is that Jais, ALLaM, Fanar, and Mawdoo3 are still formulating public responses, and none of them has yet closed the gap.
Falcon-H1 is the model that changed the board
The Technology Innovation Institute (TII) in Abu Dhabi rebuilt Falcon for Arabic on a hybrid Mamba-Transformer architecture, and the efficiency gains are what actually matter. Falcon-H1 Arabic 7B is beating models in the 10-billion-parameter class, and the 34B version is beating 70B+ rivals. In a region where compute is politically charged and cost-sensitive, raising performance per parameter is the right bet, because a smaller model that matches a larger one lowers the inference bill for enterprise buyers.
"The development of Falcon-H1 Arabic builds on years of foundational work in Arabic AI and responds directly to the needs of our communities, unlocking new possibilities in education, healthcare, and governance."
Benchmark highlights for Falcon-H1 Arabic 7B include 64.85% on Arabic MMLU, 52.89% on Exams, 48.79% on MadinahQA, 85.36% on AraTrust, and 63.71% on ALRAGE. Its scores on 3LM (STEM reasoning) and AraDice (dialect understanding) are also described as step-change improvements over the earlier Falcon-Arabic baseline.
By The Numbers
- 71.47% average Open Arabic LLM Leaderboard score for Falcon-H1 Arabic 7B
- 75.36% average for Falcon-H1 Arabic 34B, beating 70B+ rivals
- 61.87% average for Falcon-H1 Arabic 3B, the smallest model in the family
- 85.36% AraTrust score for Falcon-H1 Arabic 7B, leading the trustworthiness benchmark
- 62.57% prior OALL v2 average for Falcon-Arabic-7B-Base, now clearly superseded
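To make the arithmetic behind these figures explicit, here is a minimal Python sketch that computes the point gain over the prior 7B baseline and a crude score-per-parameter ratio. It rests on assumptions: the parameter counts are the nominal sizes in the model names (3, 7, and 34 billion), the helper names `gain_over_baseline` and `points_per_billion` are illustrative, and "points per billion parameters" is a rough back-of-envelope measure rather than any leaderboard metric. The old and new 7B figures may also not come from identical evaluation setups, so the gain is indicative only.

```python
# OALL averages quoted in this article (percent), keyed by model name.
# Parameter counts are the nominal sizes from the model names (an assumption).
SCORES = {
    "Falcon-H1 Arabic 3B": (3, 61.87),
    "Falcon-H1 Arabic 7B": (7, 71.47),
    "Falcon-H1 Arabic 34B": (34, 75.36),
}

# Prior OALL v2 average for Falcon-Arabic-7B-Base, as quoted above.
PRIOR_7B_BASELINE = 62.57


def gain_over_baseline(score, baseline=PRIOR_7B_BASELINE):
    """Absolute gain in percentage points, rounded to two decimals."""
    return round(score - baseline, 2)


def points_per_billion(params_b, score):
    """Crude efficiency ratio: leaderboard points per billion parameters."""
    return round(score / params_b, 2)


if __name__ == "__main__":
    for name, (params_b, score) in SCORES.items():
        ratio = points_per_billion(params_b, score)
        print(f"{name}: {score}% average, {ratio} points per billion parameters")

    gain = gain_over_baseline(SCORES["Falcon-H1 Arabic 7B"][1])
    print(f"7B gain over prior baseline: {gain} points")
```

On the quoted numbers, the 7B model sits 8.9 points above the prior 7B baseline, while the step from 7B to 34B adds 3.89 points for nearly five times the parameters. That is the sense in which the smaller models carry the efficiency argument.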
Where Jais, ALLaM, and Fanar stand
Jais, a collaboration between G42, Inception, and MBZUAI, has not shipped a public April 2026 refresh that closes the gap. ALLaM, developed under SDAIA and now woven into HUMAIN's commercial stack, released the HUMAIN ALLaM 7B version last year but has not publicly leapfrogged Falcon-H1 in recent benchmarks. Fanar from QCRI, represented today by Fanar-1-9B, continues to hold a leaderboard presence but trails Falcon-H1 Arabic 7B on the headline average. Fanar Star and Fanar Prime extend the family into multimodal tasks spanning language, speech, and images, but they have not yet been benchmarked directly on the OALL.
Mawdoo3 is in a separate category. The Mawdoo3 team built an Arabic content and search empire long before the modern LLM wave, but it does not run an openly benchmarked foundation model that appears on leaderboards. The company's AI assets are embedded in products and data services rather than competing for public benchmark slots.
The positioning map has shifted
The old map of Arabic AI put Jais and ALLaM at the top, with Falcon as a strong but less-Arabic-focused contender. Falcon-H1 has redrawn that map. The January 2026 release was not just a model drop. It was a strategic signal that the UAE intends to dominate Arabic LLM performance the way it has dominated sovereign-AI fundraising via G42 and MGX.
"This model reflects our focus on building Arabic AI that is not only more advanced, but genuinely useful in real-world settings."
| Model | Owner | Latest public metric | April 2026 status |
|---|---|---|---|
| Falcon-H1 Arabic 7B | TII (UAE) | 71.47% OALL avg | Leaderboard leader |
| Falcon-H1 Arabic 34B | TII (UAE) | 75.36% OALL avg | Above 70B+ class |
| Fanar-1-9B | QCRI (Qatar) | Below Falcon-H1 Arabic 7B on OALL avg | Multimodal push via Fanar Star/Prime |
| ALLaM 7B | SDAIA / HUMAIN (KSA) | Below Falcon-H1 Arabic 7B on OALL avg | Commercial integration in HUMAIN stack |
| Jais | G42 / Inception / MBZUAI (UAE) | No April 2026 refresh | Next version anticipated |
| Mawdoo3 | Mawdoo3 (Jordan) | Not benchmark-listed | Product-embedded AI |
What this means for buyers
For enterprise AI buyers across MENA, the Falcon-H1 lead is both an opportunity and a complication. It gives procurement teams a defensible argument for choosing Falcon-H1 as the base model, especially for regulated Gulf deployments where a UAE-sovereign stack is politically easier than one from outside the region. At the same time, it puts Saudi, Qatari, and Jordanian-linked alternatives under pressure to respond. The broader MENA AI harmonisation push could end up favouring whichever sovereign stack reaches the most useful cross-border performance threshold first.
Three buyer-side observations for procurement teams:
- Falcon-H1 Arabic 7B is already good enough for most regulated Gulf enterprise use cases
- Fanar's multimodal push matters if speech and image workloads are in the roadmap
- ALLaM's strength is distribution via HUMAIN and Saudi public-sector tenders rather than raw benchmarks
Our dialect benchmark coverage from earlier this week and the deeper Arabic NLP research landscape explain where the research community is pushing next. The short version is that dialect, multimodal, and long-context work are where the next leaderboard shuffle will happen.