Teaching machines to speak Arabic

Updated 08 November 2025

  • Innovation is helping AI understand the region’s language, culture, and voice

JEDDAH: As developers across the Arab world work to formalize Arabic for artificial intelligence — grappling with its many dialects, limited datasets, and deep cultural nuance — English-based AI systems have continued to surge ahead. Now, industry experts say it’s time for Arabic users to gain the same technological momentum.

The performance gap between Arabic and English natural language processing is most visible in speech recognition, where pronunciation, rhythm, and vocabulary differ sharply across dialects. These variations make it challenging for one model to understand spoken Arabic with consistent accuracy.

Despite these hurdles, progress is accelerating. With rising investment and government-backed initiatives led by Saudi Arabia and other regional powers, Arabic AI is steadily closing in on English in sophistication and accessibility.




As Arabic AI evolves, experts emphasize the importance of cultural nuance and dialect diversity in future language models. (aramcoworld.com)

Amsal Kapetanovic, head of KSA at Infobip, told Arab News: “While written NLP tasks like basic chatbots can be managed with additional work, speech recognition really exposes the limitations of current models. It requires even more fine-tuning and adaptation to handle the diversity of spoken Arabic effectively. This is where the gap between Arabic and English NLP is most pronounced.”

Infobip’s recent collaborations with telecom and private sector partners across the Gulf reveal a similar pattern: Arabic chatbots and virtual assistants often require greater oversight in their early stages than English systems. However, once they are retrained using region-specific conversational data and Gulf dialects, both accuracy and customer satisfaction rise sharply.

Arabic remains one of AI’s greatest linguistic challenges. Unlike English, it is not a single unified language but a family of dialects stretching from Asia to Africa. Its complex morphology, with prefixes, suffixes, and gender and number agreement, together with the absence of short-vowel diacritics in most written text, poses major obstacles for tokenization and model training.
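
To make the tokenization point concrete, here is a minimal Python sketch, not drawn from the article or from any Infobip tooling. It assumes only the standard `unicodedata` module; the helper name `strip_diacritics` and the sample words are illustrative choices. It shows two different words collapsing into the same string once short-vowel marks are dropped, and a phrase with attached prefixes surviving whitespace splitting as a single token.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks (Unicode category Mn), which include Arabic short-vowel signs."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Two different words, written here with their short-vowel marks.
kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"  # kataba, "he wrote"
kutub = "\u0643\u064F\u062A\u064F\u0628"         # kutub, "books"

# Without the marks, both reduce to the same three letters (k-t-b),
# so a model has to recover the intended word from context.
bare = "\u0643\u062A\u0628"
print(strip_diacritics(kataba) == bare, strip_diacritics(kutub) == bare)

# Prefixes attach directly to the word: wa + bi + al + kitab
# ("and with the book") is written without any internal spaces.
phrase = "\u0648\u0628\u0627\u0644\u0643\u062A\u0627\u0628"
print(len(phrase.split()))  # whitespace splitting sees one token
```

Real Arabic pipelines rely on dedicated morphological analyzers and subword tokenizers rather than rules this simple; the sketch only illustrates why the written form is ambiguous.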


Kapetanovic referenced a 2025 study published in JMIR Medical Informatics (“InfectA-Chat: An Arabic Large Language Model for Infectious Diseases”), which tested instruction-tuned models like GPT-4 in both English and Arabic. The research found that Arabic models still trail English by 10–20 percent in complex tasks.

“Arabic models still lag slightly behind English ones, particularly in areas like accuracy and sentiment analysis,” he said. “This is primarily due to the smaller size of Arabic training datasets and the complexity of Arabic dialects.”

He added: “Arabic itself is a family of languages and dialects — much richer and more complex than many others. This diversity adds another layer of challenge.”




Amsal Kapetanović, head of KSA unit at Infobip. (Supplied)

Yet optimism remains strong. “The good news is that there is significant investment happening, especially in the MENA region, with countries like Saudi Arabia leading the way,” Kapetanovic said. “Initiatives like Vision 2030 are accelerating progress, and we’re seeing more focus on localizing AI for Arabic speakers.”

Speech recognition continues to represent the most visible gap. “A Lebanese speaker and a Saudi speaker might use different words and speak at different speeds, making it challenging for a single model to recognize and process spoken Arabic accurately,” he said.

Localization, Kapetanovic explained, extends far beyond translation. “At Infobip, we are defining the evolution of communications in co-creation with our customers and partners throughout the region. Gartner has recognized us as a Leader in their 2025 Magic Quadrant for CPaaS. We are committed to delivering the next generation of AI-powered customer conversations to unlock seamless, high-impact engagement for MENA businesses. That’s why we put a strong emphasis on localizing our AI-driven platforms and tools to serve Arabic-speaking users effectively.”




Technical, cultural, and ethical challenges shape the future of Arabic AI, as developers strive for inclusion and linguistic parity. (aramcoworld.com)

Real-world applications are already bearing fruit. “For example, Nissan Saudi Arabia rolled out a WhatsApp chatbot (‘Kaito’) that handles customer queries in both Arabic and English,” he said. “These bots leverage Infobip’s Answers platform, which includes built-in NLP capabilities for Arabic — such as right-to-left text support and Arabic stop-word recognition — to interpret queries and intent.”

“For Saudi Arabia and the Gulf, we’ve gone beyond simple translation by implementing features and partnerships tailored to the region,” he continued.

“We’ve partnered with Lucidia, a leading Saudi tech company, to co-develop solutions that address local business needs and integrate with popular regional channels like WhatsApp and X.”

“We’ve also built language models that recognize Gulf-specific dialects and cultural expressions, making our chatbots and automation tools more intuitive for users. Additionally, our platform supports local payment integrations and business workflows unique to the region. These initiatives reflect our commitment to delivering genuinely localized technology, not just Arabic language support.”

DID YOU KNOW?

• Saudi Arabia is leading investment in Arabic AI, with Vision 2030 initiatives.

• AI can become biased and exclusionary if it does not speak or understand Arabic well.

• Infobip’s Arabic chatbots now ‘think’ in Gulf dialects, improving accuracy.

Cultural understanding, he added, is key to truly human-like AI. “Culturally aware AI should ideally be AI that understands the why behind the what,” he said. “It’s about deep research and understanding the background — not just giving straight answers to straight questions.”

“At Infobip, we integrate with multiple large language models and do so in an agnostic way,” he said. “We combine them and see which ones serve which purpose, giving us the flexibility to avoid pitfalls like AI hallucination or unwanted replies.”

The ethics of language and inclusion

Kapetanovic cautioned that neglecting Arabic in AI development poses not only technical risks but ethical ones.

“The ethical risk is that AI can become biased and exclusionary if it doesn’t speak or understand Arabic well,” he said. “If AI systems don’t handle certain languages or dialects properly, or if they lack enough regional data, they can exclude parts of the narrative or reinforce bias.”

“It’s essential for everyone in the AI ecosystem to contribute to making AI as inclusive and democratized as possible. Otherwise, we risk reinforcing disparities in services, information, and opportunities.”
 

 


‘A Paperless Event’ – the slogan of Saudi technology at the UN General Assembly for Tourism

Updated 07 November 2025


RIYADH: Saudi technology will take the place of paper at the UN General Assembly meetings for the tourism sector, billed as “a paperless event” and set to be held in Riyadh with the participation of more than 100 ministers from around the world, Al-Eqtisadiah reports.

The meetings are set amid natural green plants cultivated in the Saudi desert, which surround the roundtable that will bring the ministers together. The ministers will chart their plan and vision for the next 50 years, discuss the use of artificial intelligence in the global tourism sector, and ensure the human element is not marginalized.

Sara Al-Saud, the general supervisor of International Affairs for the Saudi Ministry of Tourism, said that “there is a shortage of an estimated 43 million workers in the global tourism sector.”

She said AI will be one of the subjects discussed by the more than 100 ministers, in addition to shaping the assembly’s vision for the next 50 years.

She added that the meetings are expected to see memorandums of understanding and agreements signed, alongside a number of recommendations that will be announced in due course.

For his part, Ahmed Al-Ghamdi, the director-general of International Research and Planning, emphasized that the human element is very important in the tourism sector, and that artificial intelligence significantly helps small and medium enterprises improve their service quality and customer experience.

The Executive Director of UN Tourism, Natalia Bayona, explained that the global tourism sector is the largest employer of youth, with 60 percent of them working with AI. She added that many tourists worldwide use AI to explore tourist destinations.

Consequently, a survey was conducted with member states to ascertain if they have local AI strategies and to identify what support could be offered to develop the mechanism, especially since the tourism sector relies heavily on small and medium enterprises.

Globally, the tourism sector contributed 10 percent to the global gross domestic product in 2024, equivalent to $10.9 trillion, recording a growth rate of 8.5 percent compared to 2023, thereby surpassing pre-COVID-19 pandemic levels.

Domestically, the Saudi tourism sector recorded unprecedented levels of visitor numbers, spending, job creation, and contribution to GDP.

The direct and indirect contribution of the tourism sector to the GDP reached 11.5 percent in 2023. The International Monetary Fund predicts that the Saudi tourism sector will achieve a growth rate of 16 percent by 2034.