Computing and the future of South Asian languages

June 9, 2019

For decades, the effort that went into planning and implementing software localization for South Asian locales was far greater than the actual use of localized software. Today, despite the increase in the use of localized software, the users in South Asia who rely on it remain a fraction of the total user base. What is even more daunting is that a large share of users write their native languages in transliterated form using the Latin alphabet. This proliferation of transliteration has resulted in a decline of native-language computing. As a result, a rising number of both technologists and users have started to find non-Latin computing cumbersome. In the long run, this trend can have a negative impact on the overall future of these languages.

Computing in South Asia has historically been dominated by the English language. One of the key reasons for this was that in the early days of the Internet, the bulk of content was in English. In the early 2000s, a wide set of initiatives was undertaken throughout the region to develop better support for its languages. Among them, the PAN Localization project was a pioneer in helping kick-start the localization of software as well as the development of input methods in several South Asian languages. In Nepal, the project worked with local organizations to release the first localized Linux distribution, dubbed Nepalinux. Around the time of its release, Nepalinux was well received by the Nepali media and by hobbyists like me. However, lacking a community and a governance model to support its development, the project later died a slow death. To add insult to injury, at the time of this writing in 2019, its website, nepalinux.org, fails to return any content.

The demise of Nepalinux, as well as the slowed momentum behind the push for localized open source software, coincided with two major changes in how people used computers. The first was the rise of social media, and the second was the switch from desktop devices to mobile phones. In a short span from the late 2000s to the early 2010s, the bulk of digital content creation and consumption moved to mobile devices. Unfortunately, the early versions of both the social media platforms and the mobile devices lacked a smooth way to create content in several local languages. As a result, most users of such devices and platforms created and consumed content using the Latin alphabet. The language of that content, however, was still the local one; the Latin alphabet simply served as the means of writing it. Later, even with the arrival of non-Latin keyboards, the Latin keyboards remained better supported and users were already familiar with Latin-based content creation, so the non-Latin keyboards failed to penetrate beyond a niche population. The result was that most digital content created in such languages used the Latin alphabet. In other words, rather than writing the Nepali sentence म भात खान्छु in the Devanagari script, most users prefer to write it as Ma bhaat khanchu. This lack of simple input methods, coupled with the gravitation towards the Latin alphabet for creating content, has meant that the overall digital footprint of local scripts has not seen any significant growth.

As the penetration of social media and mobile devices continues in the region, it is important that software makers continue to improve support for content creation, content search and content consumption in local languages using their native scripts. As software makers, it is important that we take a step in the right direction so that, in the decades to come, future generations do not miss out on the opportunity to digitally search and consume the treasure of literature available in their native languages.