
The Arabic LLM scoreboard in April 2026: Falcon, Jais, and ALLaM jockey for first place

Falcon-H1 Arabic, Jais, and ALLaM are rewriting what an Arabic LLM can do. Here is the April 2026 scoreboard, benchmark by benchmark, launch by launch.

Updated Apr 18, 2026 · 6 min read
Arabic large language models have gone from research trophy to production utility in less than two years, and the April 2026 scoreboard finally shows an Arabic-first model taking first place on the **Open Arabic LLM Leaderboard**. **Falcon-H1 Arabic**, from the **Technology Innovation Institute** in Abu Dhabi, leads the pack at 34 billion parameters, ahead of Meta's Llama 3.3 70B and Alibaba's Qwen2.5 72B. Behind Falcon, **Jais** from **MBZUAI**, **G42**, and **Cerebras** is expanding its family, and Saudi Arabia's **ALLaM** continues to push the national model agenda. This is what the Arabic LLM race actually looks like.

## The scoreboard, explained

The Open Arabic LLM Leaderboard, or OALL, evaluates models on Arabic reasoning, reading comprehension, exam benchmarks, and dialect handling. Falcon-H1 Arabic now holds first place at 34B, with a 7B variant close behind in its weight class. Jais family models still dominate certain long-context and chat benchmarks, especially for dialog-heavy workloads. ALLaM, maintained by SDAIA and Saudi Aramco, is the highest-profile sovereign model and continues to anchor Saudi public-sector deployments. Beyond these three, SADA AI in Egypt, Atlas models from Morocco, and Apple's Arabic Siri upgrades make the ecosystem richer than any single leaderboard can capture.

### By The Numbers

- Falcon-H1 Arabic 34B now ranks first on the Open Arabic LLM Leaderboard, ahead of Llama 3.3 70B and Qwen2.5 72B.
- Three Falcon-H1 Arabic sizes released: 3B, 7B, and 34B parameters, each open-weight.
- Jais family now includes 13B, 30B, and 70B models, tuned for both Modern Standard Arabic and Gulf dialects.
- ALLaM is deployed in more than 150 Saudi government use cases, according to SDAIA briefings.
- MBZUAI reports more than 5,000 downstream derivative models built on Jais weights globally.

## What Falcon-H1 actually got right

Falcon-H1 Arabic succeeded by rethinking three things at once. First, the training data mix was rebalanced so that high-quality Modern Standard Arabic sat alongside carefully filtered Egyptian, Gulf, and Levantine dialects rather than being treated as noise. Second, the tokeniser was redesigned to handle Arabic morphology without penalising common prefix and suffix patterns that tokenisers trained on English routinely mangle. Third, post-training included targeted reinforcement on the exact benchmark families that OALL tests, which is legitimate competition engineering even if it also means the gains have to be stress-tested in production.

> "Today, AI leadership is not about scale for the sake of scale. It is about making powerful tools useful, usable, and universal."
>
> Faisal Al Bannai, Secretary General, Advanced Technology Research Council, United Arab Emirates

> "The next frontier is not another benchmark leaderboard, it is whether an Arabic model can replace a Saudi call centre workflow at production SLAs."
>
> Dr. Talal Al-Shammari, public sector AI lead, Riyadh

## Where Jais and ALLaM still win

Jais is the most widely adopted open-weight Arabic model in real deployments. Its long-context variants are the default behind several Gulf bank chat platforms, and its tuning for dialog makes it the model of choice for customer service bots that need to keep conversations on-brand. ALLaM is the vehicle that Saudi Arabia uses to signal sovereign seriousness. Each model family has its niche, and most serious production stacks in the Gulf already blend two or three of them.

Our earlier coverage of the [Arabic NLP 2026 community research MENA scene](/arabic-ai/arabic-nlp-2026-community-research-mena) captures the academic and open-source side of this story, and the [MENA AI startup map 2026](/startups/mena-ai-startup-map-2026) shows how fast dependent startups are spinning up.

| Model family | Key sizes | Best at | Owner |
| --- | --- | --- | --- |
| Falcon-H1 Arabic | 3B, 7B, 34B | Benchmarks, open-weight flexibility | TII, Abu Dhabi |
| Jais | 13B, 30B, 70B | Production dialog, long context | MBZUAI, G42, Cerebras |
| ALLaM | 7B, 13B, 40B | Sovereign Saudi public-sector use | SDAIA, Aramco |
| Atlas Chat | 9B | Moroccan Darija, North African tasks | MBZUAI, Imperial |
| SADA AI | Mid-size | Egyptian Arabic legal and finance | Egyptian consortium |

## The production tests that matter

OALL is useful, but production customers care about four things. Can the model write a Modern Standard Arabic legal summary without hallucinating article numbers? Can it answer a Gulf-Arabic chat query in the same dialect without reverting to book Arabic? Can it process Egyptian Arabic banking documents with real numbers and dates? Can it run on the compute budget a regional bank or ministry is prepared to burn every month? Falcon is winning the first two increasingly often, Jais is winning the third in many domains, and ALLaM is winning the fourth inside Saudi public-sector rooms where sovereignty carries a premium.

1. Legal and regulatory summarisation across Arabic-speaking markets.
2. Dialect-fluent customer service in Gulf, Egyptian, and Levantine flavours.
3. Arabic OCR plus structured extraction for government and banking flows.
4. Voice-to-Arabic text at production latency for call centres.
5. Tool calling with Arabic-named actions, which is where most 2026 production wins will land.

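The fourth question above, whether a model fits the monthly compute budget, comes down to simple arithmetic once traffic and per-token prices are known. The sketch below is illustrative only: the `Workload` fields, the function names, and every number in the usage example are hypothetical placeholders, not figures for Falcon, Jais, ALLaM, or any real provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical traffic profile for one workflow."""
    requests_per_day: int
    input_tokens_per_request: int
    output_tokens_per_request: int


def monthly_cost(workload: Workload,
                 input_price_per_million: float,
                 output_price_per_million: float,
                 days: int = 30) -> float:
    """Estimate monthly spend from per-million-token prices."""
    tokens_in = workload.requests_per_day * workload.input_tokens_per_request * days
    tokens_out = workload.requests_per_day * workload.output_tokens_per_request * days
    cost_in = tokens_in / 1_000_000 * input_price_per_million
    cost_out = tokens_out / 1_000_000 * output_price_per_million
    return cost_in + cost_out


def fits_budget(workload: Workload,
                input_price_per_million: float,
                output_price_per_million: float,
                monthly_budget: float) -> bool:
    """True when the estimated monthly spend is within the budget."""
    cost = monthly_cost(workload, input_price_per_million, output_price_per_million)
    return cost <= monthly_budget


# Placeholder numbers, chosen only to make the arithmetic easy to follow.
call_centre = Workload(requests_per_day=10_000,
                       input_tokens_per_request=500,
                       output_tokens_per_request=200)
estimate = monthly_cost(call_centre,
                        input_price_per_million=1.0,
                        output_price_per_million=2.0)
```

With these placeholder values the workload consumes 150 million input tokens and 60 million output tokens a month, so the estimate is 150 × 1.0 + 60 × 2.0 = 270 in whatever currency the prices are quoted. For self-hosted open weights, the per-token prices would have to be derived from hardware and operations costs instead.
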
The AI in Arabia View: The Arabic LLM race is no longer about who publishes the biggest model. It is about who makes the next Arabic call centre, clinic, and classroom work without English in the loop. Falcon-H1 Arabic taking the top of the OALL leaderboard is a genuine milestone and gives the UAE bragging rights, but Jais still owns real estate inside production chatbots, and ALLaM owns the sovereign playbook in Riyadh. The smart move for any MENA buyer is to stop picking favourites and start building stacks that route specific tasks to specific models, using open weights when they can, and paid APIs only when the workload truly justifies it.
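Routing specific tasks to specific models can be as simple as a lookup table in front of the model backends. The sketch below is a minimal illustration, not a reference design: the task labels, the `TaskRouter` class, the stub backends, and the choice of default model are all hypothetical, and the task-to-model pairings only mirror the tendencies described in this article rather than measured results.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A backend is any callable that takes a prompt and returns text.
ModelFn = Callable[[str], str]


@dataclass
class TaskRouter:
    """Maps a task label to a named model backend, with a fallback."""
    routes: Dict[str, str]          # task label -> model name
    backends: Dict[str, ModelFn]    # model name -> callable
    default_model: str

    def pick(self, task: str) -> str:
        """Return the model name for a task, or the default if unmapped."""
        return self.routes.get(task, self.default_model)

    def run(self, task: str, prompt: str) -> str:
        """Send the prompt to whichever backend the task maps to."""
        return self.backends[self.pick(task)](prompt)


def make_stub(name: str) -> ModelFn:
    """Stand-in for a real inference call; it only tags the prompt."""
    return lambda prompt: f"[{name}] {prompt}"


MODELS = ("falcon-h1-arabic", "jais", "allam")

router = TaskRouter(
    routes={
        # Hypothetical task labels; pairings follow the article's narrative.
        "msa_legal_summary": "falcon-h1-arabic",
        "gulf_dialect_chat": "falcon-h1-arabic",
        "egyptian_banking_docs": "jais",
        "saudi_public_sector": "allam",
    },
    backends={name: make_stub(name) for name in MODELS},
    default_model="jais",  # arbitrary fallback for unmapped tasks
)
```

In a real stack each stub would be replaced by a call to a self-hosted open-weight model or a paid API, and the routing table would be filled in from head-to-head tests on internal data rather than from leaderboard position.
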
## Frequently Asked Questions

### Which Arabic LLM is ranked first right now?

Falcon-H1 Arabic at 34 billion parameters currently ranks first on the Open Arabic LLM Leaderboard, ahead of Meta's Llama 3.3 70B and Alibaba's Qwen2.5 72B. The model is open-weight and comes from the Technology Innovation Institute in Abu Dhabi, with smaller 3B and 7B variants for more constrained deployments.

### How do Jais and ALLaM compare to Falcon?

Jais, developed by MBZUAI, G42, and Cerebras, is the most widely deployed open-weight Arabic model for production chat and long-context workloads. ALLaM, from SDAIA and Aramco, is the primary sovereign model inside Saudi Arabia's public sector. Each wins in specific categories, which is why many production stacks mix all three.

### What about Egyptian, Moroccan, and Levantine Arabic?

Dialect support is the fastest-moving part of the scoreboard. **SADA AI** is leading on Egyptian Arabic for legal and finance use cases. **Atlas Chat**, backed by MBZUAI and Imperial College London, is the most advanced on Moroccan Darija. Levantine support has improved sharply across Falcon and Jais in the last year.

### How should a MENA enterprise pick an Arabic LLM?

Start with the workflow, not the model. Pick benchmarks that reflect the real task, run head-to-head tests on internal data, and compare cost per million tokens against production SLAs. Expect to route different workflows to different models, rather than standardising on one brand across the business.

Which Arabic LLM deserves the top spot on your shortlist for 2026, and which one would you quietly leave out?