ARTICLE
21 October 2025

From Exclusion To Inclusion: AI, Language And Access To Justice In South Africa

ENS

Contributor

ENS is an independent law firm with over 200 years of experience. The firm has over 600 practitioners in 14 offices on the continent, in Ghana, Mauritius, Namibia, Rwanda, South Africa, Tanzania and Uganda.

At the J20 Summit, Deputy Chief Justice Mlambo posed a critical question: in a multilingual society, are litigants still excluded if they do not speak the colonial language of record? South Africa's legal system must confront a dual reality. On the one hand, the Constitution guarantees parity of esteem and treatment for all 12 official languages. On the other, the courts have, since 2017, adopted English as the language of record for reasons of practicality and uniformity.

South Africa recognises 12 official languages, namely Sepedi, Sesotho, Setswana, siSwati, Tshivenda, Xitsonga, Afrikaans, South African Sign Language, English, isiNdebele, isiXhosa and isiZulu. While isiZulu and isiXhosa are the most widely spoken at home, English remains the pre-eminent language of business and legal record. In practice, this means accused persons and witnesses may testify in the language of their choice, but the official record is in English. The intent is administrative clarity; the effect is a translation burden and the risk of exclusion for those whose legal realities are expressed most precisely in another tongue.

This balancing act is not unique to South Africa. Jurisdictions worldwide grapple with ensuring access to justice for non-native speakers as they standardise court records for efficiency and interoperability. The immediate challenge is operational: producing reliable, timely translations without losing nuance. The strategic opportunity is technological: using responsible artificial intelligence ("AI") to narrow, rather than widen, the language gap.

Are indigenous languages linguistically excluded from AI transcription and translation?

Justice Mlambo's intervention at the J20 Summit highlighted a pivotal question for access to justice: do AI tools enable courts to operate credibly across languages, particularly for indigenous language speakers?

According to UNESCO, the African continent is home to a third of the world's languages: with "an estimated 1,500 to 3,000 languages, Africa is a true hub of linguistic and cultural wealth".

The "Mind the (Language) Gap" white paper notes that most major large language models ("LLMs") underperform for non-English, and especially low-resource, languages and are not attuned to relevant cultural contexts. Many African indigenous languages are considered low-resource because they lack the digital data, annotated datasets and computational tools needed for effective AI-powered translation and speech recognition. This scarcity makes it difficult to train and deploy language technologies, resulting in digital underrepresentation for speakers of these languages. Simply adding more languages to a model is not a complete answer either: the so-called "curse of multilinguality" means that as more languages are added, performance for each may decline, especially for those with limited digital resources.

LLMs trained predominantly on English data are often biased in favour of native English speakers, meaning low-resource language communities are at risk of being left behind in the AI revolution.

Local is lekker

A growing body of international research highlights the transformative potential of AI in supporting and revitalising indigenous languages, as well as the gaps that remain. As noted by Forus International in May 2023, "Chat GPT for example knows the capital of Kenya but when being asked in Kinyarwanda (one of the official languages of Rwanda), it does not understand the question". Whilst AI models are continuously improving, Forus highlighted the importance of domain-specific datasets and the challenge posed by languages that are spoken but not written.

The NGLUEni project benchmarks and improves the performance of pretrained language models for the South African Nguni languages: isiXhosa, isiZulu, isiNdebele and siSwati, four of South Africa's official languages. By creating a unified evaluation framework and adapting models specifically for these linguistically related, under-resourced languages, the study demonstrates significant gains in accuracy and cross-lingual transfer. This work supports greater linguistic inclusion in digital tools and AI systems.

African Innovation and Collaboration

Recent developments on the continent demonstrate African innovation and collaboration to address the digital language divide and overcome the challenges of linguistic underrepresentation in AI. As reported by iAfrica in September 2025, a "landmark AI dataset" has been created by African researchers to "close the language gap and boost digital inclusion."

Linguists, computer scientists and AI experts across Kenya, Nigeria and South Africa, funded by a USD2.2 million Gates Foundation grant, created AI-ready open-access datasets representative of real African speech patterns. This critical resource will enable translation, transcription and conversational AI tools to better understand and process the way people live, speak and interact on the African continent.

The world's largest language barrier lifted by AI

At the J20 Summit, Luís Roberto Barroso, Justice of the Supreme Federal Court of Brazil, considered that overcoming language barriers is one of the most important benefits that artificial intelligence will bring to the world.

This holds particular importance when handling cross-border legal matters. It is essential to incorporate a framework for the responsible and effective use and integration of AI and language technologies into everyday legal practice.

Crossing linguistic borders in litigation and investigations with AI

Language need no longer be a hindrance in litigation, investigations, regulatory responses and other document-heavy matters.

Generative AI tools can analyse millions of documents involving multiple languages in a matter of days. Specialised tools, such as Relativity's aiR for Review, can, when prompted with case-specific criteria, summarise, classify and pull together the evidence, regardless of language.

This approach not only accelerates the identification of critical evidence but also provides transparent reasoning and direct links to supporting evidence, which is invaluable for validating the generative AI review as well as for organising the findings for cross-examination and report writing. Further, it removes the human bias inherent in the search phase of document reviews: in multilingual matters, search criteria for certain languages are often curated by non-native speakers, creating gaps in the identification of crucial information.

By combining the speed and scalability of AI with human oversight, legal teams can manage cross-border, multilingual matters more efficiently, reduce risk and ensure that critical evidence is not lost in translation.

Co-piloting in-house legal practice with AI translation and transcription

In-house legal teams across Africa routinely operate in English, French, Portuguese and a multitude of local languages. AI translation and transcription can accelerate contract review and drafting, localise compliance policies, enable multilingual knowledge bases and streamline communications across jurisdictions. Employees are more likely to internalise policies and training when they are conveyed in their first language, increasing the likelihood of operational compliance rather than "paper compliance".

As organisations build translation memories and vetted glossaries, quality and speed improve over time, creating a resilient multilingual knowledge infrastructure. However, adoption requires robust governance. Legal exposure can be created through mistranslations or omissions in high-stakes documents, confidentiality risks from unvetted tools and inconsistent terminology resulting in misalignment across offices. The prudent approach is a structured, dual-pass process: machine translation for scale, followed by subject-matter expert review, supported by controls for privacy, privilege, confidentiality and data security.
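By way of illustration only, the dual-pass process described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of any particular product: the `GLOSSARY` entries, the `Segment` class and the `check_glossary` and `review_queue` functions are all hypothetical names introduced here, and the machine translation itself is assumed to come from whatever vetted tool the organisation uses. The sketch checks each machine-translated segment against an approved glossary and moves segments with terminology mismatches to the front of the human review queue. Every segment still goes to a subject-matter expert.

```python
from dataclasses import dataclass, field

# Illustrative glossary only (an assumption, not vetted legal terminology):
# English source term -> approved French rendering.
GLOSSARY = {
    "governing law": "droit applicable",
    "confidentiality": "confidentialité",
}


@dataclass
class Segment:
    source: str          # original English text
    machine_output: str  # first pass: draft produced by a translation tool
    issues: list = field(default_factory=list)

    @property
    def needs_priority_review(self) -> bool:
        return bool(self.issues)


def check_glossary(segment, glossary=GLOSSARY):
    """Record glossary terms whose approved rendering is missing from the draft."""
    src = segment.source.lower()
    out = segment.machine_output.lower()
    segment.issues = [
        f"'{term}' should be rendered as '{approved}'"
        for term, approved in glossary.items()
        if term in src and approved.lower() not in out
    ]
    return segment


def review_queue(segments):
    """Second pass: all segments go to a human reviewer, flagged ones first."""
    checked = [check_glossary(s) for s in segments]
    # sorted() is stable, so unflagged segments keep their original order.
    return sorted(checked, key=lambda s: not s.needs_priority_review)


drafts = [
    Segment("Payment is due within 30 days.",
            "Le paiement est dû dans les 30 jours."),
    Segment("The governing law is South African law.",
            "La loi en vigueur est la loi sud-africaine."),
]
queue = review_queue(drafts)
for seg in queue:
    print(seg.source, "|", seg.issues or "no glossary issues")
```

A simple substring check of this kind only catches inconsistent terminology. It does not detect omissions or subtle shifts in meaning, which is why expert review remains the second pass rather than an optional extra.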

Call to action: legal standards for linguistic inclusion

AI is reshaping legal work, from court transcription to document review. When designed for local contexts and deployed responsibly, AI-powered transcription and translation can turn linguistic barriers into vectors for inclusion and more equitable outcomes. Sustainable progress depends on legal professionals who are AI-literate, insist on equality of treatment across languages and support ongoing investment in local-language datasets, domain-specific fine-tuning and governance frameworks that protect confidentiality and accuracy.

Legal professionals should set the tone: embrace innovation, insist on equality.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
