
Falcon, Jais, and ALLaM: The Three Models Defining Arabic AI

A deep technical and strategic comparison of TII's Falcon-H1 Arabic, G42/Inception's Jais 2, and SDAIA's ALLaM - the three sovereign language models competing to give half a billion Arabic speakers an AI that understands their language.

Updated Apr 17, 2026 · 16 min read

Three Nations, Three Models, One Language - and a Race That Will Define Arabic AI for a Generation

In the global contest to build artificial intelligence that actually understands human language, the Arabic-speaking world has largely been an afterthought. English dominates training corpora. Mandarin commands serious investment. But Arabic - spoken by over 422 million people across more than twenty countries, written in a script that runs right to left, fractured into over thirty dialects so distinct they can be mutually unintelligible - has until recently been treated as a problem too complex and a market too fragmented to prioritise.

That changed with the emergence of three flagship large language models, each backed by a different Gulf state, each reflecting a distinct strategic vision for sovereign AI. Technology Innovation Institute's Falcon, developed in Abu Dhabi. Inception's Jais, built through a UAE partnership with Cerebras Systems and MBZUAI. And SDAIA's ALLaM, Saudi Arabia's national Arabic model now powering the HUMAIN ecosystem.

Together, these three models represent the most significant investment in Arabic natural language processing in history. But they are not just technical artefacts. They are instruments of national strategy, expressions of sovereign ambition, and - depending on how the next few years unfold - potentially the foundation of an Arabic AI ecosystem that could serve half a billion people.

This is their story, compared head to head.

By The Numbers

  • Falcon-H1 Arabic 34B scores 75.36% on the Open Arabic LLM Leaderboard, outperforming models twice its size including Qwen2.5 72B
  • Jais 2 was pretrained from scratch on 2.6 trillion curated Arabic, English, and code tokens
  • ALLaM's enterprise variant reportedly reaches 1.8 trillion parameters, matching GPT-4o scale
  • The Falcon Foundation received an initial $300 million pledge to champion open-source AI
  • G42 has raised $2.3 billion in total funding, including $1.5 billion from Microsoft
  • HUMAIN plans to deploy 500 megawatts of compute capacity with several hundred thousand NVIDIA GPUs
  • At least 53 Arabic language models had been identified by Q1 2025, with the Gulf states leading development

Falcon: Abu Dhabi's Open-Source Champion

Technology Innovation Institute (TII), the applied research pillar of Abu Dhabi's Advanced Technology Research Council, has built the Falcon series into one of the most recognised open-source model families in the world. The journey began in 2023 with Falcon 40B - a 40 billion parameter model trained on 1 trillion tokens of the RefinedWeb dataset using 384 A100 GPUs over two months. It topped the Hugging Face Open LLM Leaderboard on release, outperforming Meta's LLaMA-30B and LLaMA-65B.

Falcon 180B followed, scaling to 180 billion parameters trained on 3.5 trillion tokens across 4,096 A100 GPUs. The training consumed approximately seven million GPU-hours on Amazon SageMaker, producing a model that scored 68.74 on the MMLU benchmark. In February 2024, TII launched the Falcon Foundation at the World Governments Summit with an initial pledge of $300 million - a non-profit entity dedicated to advancing open-source generative AI.

The Falcon 3 family arrived in December 2024, releasing 30 model checkpoints ranging from 1 billion to 10 billion parameters, trained on 14 trillion tokens of web, code, STEM, and curated multilingual data using 1,024 H100 GPUs. The family introduced Falcon3-Mamba-7B, a state-of-the-art State Space Language Model with 32,000-token context length.

But the real breakthrough for Arabic came on 5 January 2026, with the launch of Falcon-H1 Arabic.

Falcon-H1 Arabic: The Technical Details

Falcon-H1 Arabic employs a hybrid Mamba-Transformer architecture - a deliberate break from the pure transformer design that dominates the field. This hybrid approach combines the efficiency of state-space models for long sequences with the attention mechanisms that transformers excel at for contextual understanding. The model ships in three sizes: 3B, 7B, and 34B parameters.
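The intuition behind the hybrid can be shown with a toy sketch. This is illustrative only, not TII's actual layers: a state-space layer carries a fixed-size hidden state from token to token (constant per-token cost), while a causal attention layer re-reads a cache that grows with sequence length. The coefficients and the averaging "attention" below are deliberately simplified placeholders.

```python
# Toy contrast (not TII's implementation): why hybrid Mamba-Transformer
# blocks are attractive for long sequences.

def ssm_layer(xs, a=0.9, b=0.5, c=1.0):
    """Minimal linear state-space recurrence: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.
    The state h is a single number here - per-token cost stays O(1)."""
    h, ys = 0.0, []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys

def attention_layer(xs):
    """Caricature of causal attention: each step looks back over all past
    tokens, so the cache - and per-token cost - grows with position."""
    ys, cache = [], []
    for x in xs:
        cache.append(x)
        ys.append(sum(cache) / len(cache))
    return ys

def hybrid_stack(xs):
    """Interleave the two layer types, as hybrid architectures do:
    cheap recurrent mixing first, then attention for global context."""
    return attention_layer(ssm_layer(xs))

out = hybrid_stack([1.0, 0.0, 0.0, 0.0])
```

The point of the interleaving is that the SSM layers keep long-sequence cost bounded while the attention layers retain precise token-to-token lookups, which is the trade-off the Falcon-H1 design description emphasises.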

The benchmark results are striking. On the Open Arabic LLM Leaderboard (OALL v2), the 3B model scores 61.87 per cent - ten full points ahead of Microsoft's Phi-4 Mini 4B. The 7B model reaches 71.47 per cent, surpassing all models in the approximately 10B parameter class, including Qatar's Fanar-1-9B and HUMAIN's ALLaM 7B. The 34B flagship scores 75.36 per cent, outperforming both Qwen2.5 72B and Llama-3.3 70B despite being roughly half their size.

On specialised Arabic benchmarks, the picture is equally compelling. ArabCulture scores reach approximately 80 per cent for both the 7B and 34B variants. On the 3LM STEM benchmark, the 34B model achieves 96 per cent on native questions and 94 per cent on synthetic. The AraDice dialect evaluation shows coverage across Egyptian, Gulf, Levantine, and Maghrebi Arabic, with the 34B model averaging approximately 53 per cent across all dialect categories.

"We are not building models that process Arabic as a secondary capability. Falcon-H1 Arabic was designed from the ground up to understand the full complexity of the Arabic language - its dialects, its morphology, its cultural context." - TII announcement, January 2026

Falcon-H1 Arabic is released under the Falcon License 2.0, an Apache 2.0-based permissive licence with an acceptable use policy. It is fully open source, available on Hugging Face, and free for commercial use with no hosting restrictions.

Jais: Speed, Scale, and the Cerebras Advantage

Inception, founded in late 2017 as the first dedicated AI research hub in the region and part of the G42 ecosystem, has taken a different path with Jais. Where Falcon emphasises architectural innovation and open-source reach, Jais has focused on training data quality, inference speed, and commercial deployment infrastructure.

For related analysis, see: [Egypt's Shift in AI Regulation](/news/egypts-shift-in-ai-regulation).

The original Jais 13B launched in August 2023, built on a Llama2 foundation with an expanded tokeniser that doubled the base vocabulary. It was trained on 116 billion Arabic tokens using the Condor Galaxy 1 supercomputer, developed in partnership with MBZUAI. Jais 70B followed in 2024, scaling to 70 billion parameters trained on 370 billion tokens - 330 billion of them Arabic, the largest Arabic dataset for any open-source model at the time.

G42's corporate trajectory provides essential context. The company raised $800 million from Silver Lake in April 2021, followed by Microsoft's landmark $1.5 billion investment in April 2024, with Microsoft's Brad Smith joining the board. Total funding reached $2.3 billion. In parallel, G42 launched partnerships with OpenAI to deploy advanced AI optimised for the UAE and broader region, and anchored the Stargate UAE project - a 1-gigawatt compute cluster.

Jais 2: The December 2025 Leap

Jais 2, announced on 9 December 2025, represents a ground-up rebuild. Developed by Inception in partnership with Cerebras Systems and MBZUAI's Institute of Foundational Models, it ships in two sizes - 8B and 70B parameters - pretrained from scratch on 2.6 trillion curated Arabic, English, and code tokens.

The standout metric is inference speed. Running on Cerebras hardware, Jais 2 achieves up to 2,000 tokens per second - a figure that transforms the economics of deploying Arabic AI at scale. For enterprise applications handling millions of Arabic-language queries daily - government services, banking, telecommunications - this speed advantage is not merely technical. It is commercial.
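A back-of-envelope calculation shows why the throughput figure matters commercially. The 2,000 tokens per second is the reported Jais 2 figure; the average response length and the 10-million-query workload are illustrative assumptions, not vendor numbers.

```python
# Back-of-envelope capacity maths under stated assumptions.

TOKENS_PER_SEC = 2_000        # reported Jais 2 decode rate on Cerebras hardware
AVG_RESPONSE_TOKENS = 250     # assumed average reply length for a service query
SECONDS_PER_DAY = 86_400

# One sustained stream at this rate generates ~173M tokens per day...
tokens_per_day = TOKENS_PER_SEC * SECONDS_PER_DAY

# ...which covers roughly 0.7M full responses per day.
queries_per_day = tokens_per_day // AVG_RESPONSE_TOKENS

# A national-scale service handling 10M daily queries would need on the
# order of 15 such streams, under these assumptions.
streams_for_10m = (10_000_000 * AVG_RESPONSE_TOKENS) / tokens_per_day
```

Even as a rough sketch, this is the arithmetic behind the claim that speed "transforms the economics" of deployment: capacity scales linearly with throughput, so a 10x faster decoder needs a tenth of the serving hardware for the same query volume.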

On the AraGen benchmark, Jais-2-70B achieves the highest scores across nearly all metrics, outperforming both Qwen2.5-72B and Llama-3.3-70B. The model excels particularly in culturally rooted domains: poetry, religion, cuisine, dream interpretation, translation, summarisation, and financial analysis. Its dialect coverage spans Modern Standard Arabic and regional variants, with specific engineering for code-switching and informal tone - the way Arabic is actually used in everyday digital communication.

Jais 2 is released with full open-source weights, available for download on Inception's Hugging Face repository, and deployed for production use through Azure AI Model Catalog.

ALLaM: Saudi Arabia's Sovereign Stack

If Falcon is the open-source champion and Jais is the speed-optimised commercial play, ALLaM is the sovereign infrastructure model - designed to serve Saudi Arabia's national AI ambitions and, through HUMAIN, to anchor an entire domestic AI ecosystem.

ALLaM is developed by SDAIA, the Saudi Data and Artificial Intelligence Authority, established by royal decree in August 2019. SDAIA oversees the National Strategy for Data and AI, which targets SAR 75 billion in investments by 2030. ALLaM is the linguistic foundation of that strategy.

The model family includes ALLaM 7B, available on Hugging Face, and ALLaM 34B, which launched on 25 August 2025 as the engine powering the HUMAIN Chat application. SDAIA has described it as "one of the leading large Arabic language models in the Arab world." An enterprise variant reportedly scales to 1.8 trillion parameters - matching OpenAI's GPT-4o in scale and positioning ALLaM as a government and enterprise-grade model rather than a consumer or developer tool.

For related analysis, see: [Revolutionising Customer Service Through AI in Middle East](/business/boost-loyalty-cut-costs-chatgpts-secret-weapon-for-customer-service).

What sets ALLaM apart is not its benchmark performance - on the Open Arabic LLM Leaderboard, the 7B variant trails both Falcon-H1 Arabic 7B and Jais 2 8B - but its integration into a national AI infrastructure. HUMAIN, the PIF-owned company that deploys ALLaM, has plans for 500 megawatts of compute capacity, 11 data centres each with 200-megawatt capacity, and an initial deployment of 18,000 GB300 GPUs. The SDAIA Hexagon data centre in Riyadh offers 480 megawatts of power across 2.78 million square metres.

"We want to be the third-largest AI provider in the world, behind the United States and China." - Tareq Amin, HUMAIN CEO

ALLaM's licensing model reflects this sovereign posture. Unlike Falcon and Jais, which are fully open source, ALLaM is positioned for enterprise and government use within the HUMAIN ecosystem. This is a deliberate strategic choice: Saudi Arabia is building a vertically integrated AI stack - from compute infrastructure to foundation model to application layer - controlled domestically.

Head-to-Head: How They Compare

| Dimension | Falcon-H1 Arabic | Jais 2 | ALLaM |
|---|---|---|---|
| Developer | TII (Abu Dhabi) | Inception/Cerebras/MBZUAI (Abu Dhabi) | SDAIA/HUMAIN (Riyadh) |
| Sizes | 3B, 7B, 34B | 8B, 70B | 7B, 34B, 1.8T (enterprise) |
| Architecture | Hybrid Mamba-Transformer | Redesigned from scratch | Not publicly detailed |
| Training data | 14T tokens base (Falcon 3), Arabic-adapted | 2.6T curated Arabic/English/code tokens | Not publicly disclosed |
| Best benchmark result | 75.36% on OALL v2 (34B) | State of the art on AraGen (70B) | Trails Falcon-H1 7B and Jais 2 8B on OALL v2 (7B) |
| Inference speed | Standard | Up to 2,000 tokens/sec (Cerebras) | Not disclosed |
| Dialect coverage | MSA, Egyptian, Levantine, Gulf, Maghrebi | MSA + regional, code-switching | Not specified publicly |
| Licence | Falcon License 2.0 (Apache-based, open) | Full open-source weights | Enterprise/government (HUMAIN) |
| Commercial access | Hugging Face, free | Hugging Face + Azure AI Catalog | HUMAIN ecosystem |
| Strategic model | Open ecosystem, global developer reach | Speed-first, enterprise deployment | Sovereign stack, domestic control |

The Benchmarks That Matter for Arabic

Understanding these models requires understanding the benchmarks designed specifically for Arabic AI - a specialised evaluation ecosystem that has matured significantly since 2023.

The Open Arabic LLM Leaderboard (OALL v2) evaluates models across six multiple-choice tasks - including Arabic MMLU, Arabic Exams, Alghafa, MadinahQA, and Aratrust - plus one generative task (Alrage). It is the closest equivalent to the English-language Open LLM Leaderboard and the primary ranking system for Arabic models.
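Mechanically, a leaderboard headline number of this kind is an aggregate over per-task accuracies. The sketch below is illustrative only: the task names follow the article, but the scores are made-up placeholders (not real leaderboard entries), and a simple unweighted mean is assumed for the aggregation.

```python
# Illustrative only: how a single leaderboard score summarises many tasks.
# Scores are invented placeholders; an unweighted mean is assumed.

def oall_style_average(task_scores: dict) -> float:
    """Unweighted mean over per-task accuracies, rounded to two decimals."""
    return round(sum(task_scores.values()) / len(task_scores), 2)

scores = {
    "Arabic MMLU": 72.0,
    "Arabic Exams": 68.5,
    "Alghafa": 80.1,
    "MadinahQA": 74.3,
    "Aratrust": 77.2,
    "Alrage": 70.9,   # the generative task, scored alongside the MCQ tasks
}

headline = oall_style_average(scores)  # one number, like the 75.36% cited above
```

The practical consequence is that a headline figure can mask uneven task performance, which is why the specialised benchmarks below (AraGen, AraDice) still matter.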

For related analysis, see: [Revolutionising the Future of Business with Generative AI](/business/revolutionising-the-future-of-business-with-generative-ai).

AraGen focuses on generative capabilities: translation, summarisation, financial analysis, and culturally rooted domains like poetry, religion, and cuisine. This benchmark captures something OALL misses - how well a model generates natural, contextually appropriate Arabic rather than simply selecting correct answers.

AraDice evaluates dialect and cultural understanding across Egyptian, Gulf, and Levantine Arabic. For any model claiming to serve the Arabic-speaking world, dialect performance is arguably the most important metric. A model that handles Modern Standard Arabic beautifully but fails on Egyptian colloquial is useless for most consumer applications.

BALSAM provides a comparative platform specifically designed for benchmarking Arabic LLMs, while ALUE (Arabic Language Understanding Evaluation) tests eight core language understanding tasks. Together, these benchmarks create a rigorous evaluation framework that did not exist three years ago - itself a sign of the Arabic AI ecosystem's maturation.

Three Strategies, Three Visions of Sovereignty

The technical comparison, while essential, only tells half the story. Falcon, Jais, and ALLaM embody three fundamentally different theories of how a nation builds AI sovereignty.

Falcon's theory is that sovereignty comes through influence. By releasing the world's best Arabic model as open source, TII ensures that every developer, startup, and government agency building Arabic AI applications starts from a UAE-originated foundation. The $300 million Falcon Foundation is not charity - it is an investment in ecosystem control through ubiquity. When a Jordanian fintech or an Egyptian healthtech startup fine-tunes Falcon for their use case, Abu Dhabi's AI influence extends without requiring any commercial agreement.

Jais's theory is that sovereignty comes through infrastructure partnerships. G42's web of relationships - Microsoft, OpenAI, Cerebras, Oracle - creates a commercial ecosystem where Jais is not just a model but a deployment platform. The Azure AI Catalog integration, the Cerebras inference acceleration, the Stargate UAE compute cluster - these ensure that Jais can be deployed at scale in enterprise environments. The UAE becomes indispensable not by giving the model away but by making it the fastest and easiest Arabic model to deploy in production.

ALLaM's theory is that sovereignty comes through vertical integration. Saudi Arabia is building every layer of the stack domestically - from Aramco's energy infrastructure powering the data centres, to HUMAIN's GPU clusters, to ALLaM's language capabilities, to the applications built on top. This approach sacrifices the developer ecosystem breadth that Falcon enjoys and the deployment flexibility that Jais offers, but it achieves something neither rival can claim: complete domestic control over the entire AI value chain.

The Geopolitical Dimension

These models exist within a geopolitical context that shapes their development as profoundly as any architectural decision. The UAE-Saudi AI competition is not merely commercial - it is an expression of each nation's vision for its post-hydrocarbon identity.

For related analysis, see: [Europe Takes the Lead into 2024: Sweeping New AI Rules Set Global Standards](/news/europe-takes-the-lead-into-2024-sweeping-new-ai-rules-set-global-standards).

The UAE has positioned itself as the open, partnership-driven AI hub. Microsoft's $15.2 billion commitment, OpenAI's Stargate UAE partnership, and G42's TIME100 recognition as one of the world's most influential companies reflect a strategy of embedding UAE AI infrastructure into the global technology supply chain so deeply that it becomes indispensable.

Saudi Arabia, through HUMAIN and its massive GPU deployment plans, is pursuing a more autonomous path. The $10 billion Google Cloud partnership and the strategic AI agreements signed with American technology companies during President Trump's 2025 visit provide access to cutting-edge hardware and know-how, but the ultimate goal is domestic capability. When HUMAIN's CEO declares an ambition to be the world's third-largest AI provider, he is articulating a national project, not a corporate strategy.

American technology policy adds another layer of complexity. The flow of advanced AI chips to the Gulf is now a matter of US national security policy, with export controls and bilateral agreements shaping which nations can access NVIDIA's latest hardware. Both the UAE and Saudi Arabia have navigated this landscape through strategic concessions on security and governance standards, but the dependency on American semiconductor supply chains remains a structural vulnerability for both nations' sovereign AI ambitions.

What Comes Next

The Arabic LLM landscape in early 2026 looks radically different from even eighteen months ago. Where there were once zero competitive Arabic-first language models, there are now three major families and at least 53 Arabic models identified globally. The benchmarking infrastructure has matured. Commercial deployment pathways exist. Training data, while still inadequate for many dialects, has expanded dramatically.

Several developments will determine which of these three models - or which combination - comes to define Arabic AI for the next decade.

First, dialect coverage will separate the serious from the symbolic. A model that handles Modern Standard Arabic and Gulf dialect but fails on Egyptian, Levantine, or Maghrebi Arabic cannot claim to serve the Arabic-speaking world. Falcon-H1 Arabic's explicit five-dialect coverage is currently the most comprehensive, but the gap is narrowing.

Second, commercial deployment will matter more than benchmarks. The model that gets embedded in government services, banking platforms, healthcare systems, and e-commerce applications across the region will generate the data flywheel and developer ecosystem that sustains long-term dominance. Jais's Azure integration and Cerebras inference speed give it an advantage here. ALLaM's HUMAIN integration gives it a captive Saudi market.

Third, the open-source question will resolve. Falcon and Jais are both openly available. ALLaM is not. In the English-language AI world, the open-source versus closed-source debate has reshaped the competitive landscape. The same dynamics will play out in Arabic AI - and the resolution will determine whether Arabic AI development is concentrated in a few sovereign institutions or distributed across a broader developer ecosystem.

THE AI IN ARABIA VIEW: The emergence of Falcon, Jais, and ALLaM is not just a technical achievement - it is a geopolitical statement. For the first time, Arabic speakers have access to language models built specifically for their linguistic reality, backed by sovereign investment at a scale that ensures long-term viability. The competition between these three models is healthy. It drives innovation, expands dialect coverage, and creates commercial incentives to serve Arabic speakers rather than treating them as an afterthought. The risk is fragmentation - three walled gardens instead of one thriving ecosystem. The opportunity is that the Gulf's AI investment creates a foundation robust enough to support Arabic AI development for generations. At AI in Arabia, we will track every benchmark, every deployment, and every strategic shift in this race - because it is, quite literally, the race to give half a billion people an AI that speaks their language.


FAQ

Which Arabic language model is currently the best performer?

On the Open Arabic LLM Leaderboard (OALL v2), Falcon-H1 Arabic 34B leads with a score of 75.36 per cent, outperforming models twice its parameter count. On the AraGen generative benchmark, Jais-2-70B achieves the highest scores. Performance depends on the specific task and evaluation framework.

Are these models free to use?

Falcon-H1 Arabic and Jais 2 are both open source and free for commercial use. Falcon uses the Falcon License 2.0 (Apache 2.0-based), while Jais 2 provides full open-source weights. ALLaM is positioned for enterprise and government use within the HUMAIN ecosystem and is not openly available.

Which Arabic dialects do these models support?

Falcon-H1 Arabic explicitly covers Modern Standard Arabic, Egyptian, Levantine, Gulf, and Maghrebi dialects. Jais 2 supports MSA and regional dialects with engineering for code-switching. ALLaM's dialect coverage has not been publicly detailed.

How do these models compare to global models like GPT-4 or Claude on Arabic tasks?

On Arabic-specific benchmarks, the best regional models now outperform many larger global models. Falcon-H1 Arabic 34B surpasses Qwen2.5 72B and Llama-3.3 70B on OALL v2 despite being half their size. However, global models may still outperform on certain general-knowledge or reasoning tasks.

What is the significance of inference speed for Arabic AI deployment?

Jais 2's 2,000 tokens per second on Cerebras hardware is critical for commercial viability. Government services, banking, and telecommunications platforms handling millions of Arabic queries daily need fast, cost-effective inference. Speed determines whether a model can be deployed at national scale.

Why does Saudi Arabia take a different approach with ALLaM compared to the UAE's open models?

Saudi Arabia is building a vertically integrated sovereign AI stack through HUMAIN, prioritising domestic control over the entire value chain - from compute infrastructure to foundation model to application layer. The UAE favours open-source ecosystem influence (Falcon) and commercial partnership networks (Jais). Both approaches reflect different theories of AI sovereignty.

What should developers building Arabic AI applications choose?

For broad, open development: Falcon-H1 Arabic offers the widest dialect coverage and easiest access. For enterprise deployment requiring speed: Jais 2 with Cerebras integration. For Saudi government or enterprise contracts: ALLaM through the HUMAIN ecosystem. Many developers use multiple models for different components of their applications.

The Arabic LLM race is just beginning.

## Frequently Asked Questions

### Q: How is the Middle East positioning itself in the global AI race?

Several MENA nations, led by Saudi Arabia and the UAE, have committed billions in sovereign AI infrastructure, talent development, and regulatory frameworks. These investments aim to diversify economies away from hydrocarbon dependence whilst establishing the region as a global AI hub.

### Q: What role does government policy play in MENA's AI development?

Government policy is the primary driver. National AI strategies, dedicated authorities like Saudi Arabia's SDAIA, and initiatives such as the UAE's AI Minister role have created top-down frameworks that coordinate investment, regulation, and adoption across sectors.

### Q: Why is Arabic natural language processing particularly challenging?

Arabic NLP faces unique challenges including dialectal variation across 25+ countries, complex morphology with root-pattern word formation, right-to-left script handling, and relatively limited high-quality training data compared to English.