
India's Road Map Towards Building Foundation LLMs in India

Building LLMs presents several challenges: the need for extensive computing power, large datasets, and specialized expertise

By Paromita Gupta

You're reading Entrepreneur India, an international franchise of Entrepreneur Media.


India has recorded significant growth in artificial intelligence over the past two years, after the large language model (LLM) chatbot ChatGPT became the talk of the town and set off a global AI race. However, India is a uniquely diverse country, with many cultures, thousands of dialects, and numerous languages spoken across it. This diversity sometimes makes it harder to implement innovative policies, and the process consumes a lot of time and resources.

Leveraging Open Source for Building Scalable Foundation Models in India

Sharing insights on how India can leverage open-source best practices to build foundation models and LLMs, Pratyush Kumar, co-founder of Sarvam AI, said that the technology is very new and will take time to improve and become more useful. Building LLMs presents several challenges: the need for extensive computing power, large datasets, and specialized expertise. Open source is essential to democratizing AI technology, and it presents unique challenges and opportunities. India has done well, but it still has a long way to go before these technologies mature and become more universally useful.

"And that's where open source as a movement in computing has been very strong. And kudos to the government of India. I think the Bhashini project has been a big success in demonstrating how to do open-source Indian language AI at scale," said Kumar.

The Complexity of Indian Languages

Speaking on the innovative approaches and technologies India can use to address the challenges of building a foundation model, Mohit Sewak, AI researcher and developer relations, South Asia, NVIDIA, said that India's linguistic diversity is immense, with 23 official languages, over 10,500 unique dialects, and 123 unique languages. LLMs like GPT currently support only up to 100 languages, with a tokenizer vocabulary size of 254,000. India's diverse linguistic landscape, however, requires models with an even larger tokenizer vocabulary to handle the multitude of languages and dialects effectively.

"That means we are talking about tens of trillions of tokens of data across these languages if we want a real Indian LLM that can actually do the type of tasks that we expect it to do," said Sewak.
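One way to see why tokenizer vocabulary matters for Indian languages is to look at raw text encoding. The sketch below is illustrative only and is not something the speakers presented: the helper name `utf8_bytes_per_char` and the sample sentences are assumptions made for this example. It measures how many UTF-8 bytes each character takes. A byte-level tokenizer with no dedicated vocabulary entries for a script would have to fall back to roughly that many tokens per character.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per non-whitespace character."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(len(c.encode("utf-8")) for c in chars) / len(chars)


# Hypothetical sample sentences, chosen only for illustration.
samples = {
    "English": "India has many languages",
    "Hindi": "भारत में कई भाषाएँ हैं",
    "Tamil": "இந்தியாவில் பல மொழிகள் உள்ளன",
}

for language, sentence in samples.items():
    print(f"{language}: {utf8_bytes_per_char(sentence):.2f} bytes per character")
```

Basic Latin letters take one byte each, while Devanagari and Tamil code points take three, so a vocabulary that lacks entries for these scripts spends far more tokens on the same amount of text.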

Alignment with Cultural Sensitivities: Need Insider's View

Addressing whether present language models accurately represent Indian cultural nuances and traditions in languages other than English, Dr. Kalika Bali, principal researcher at Microsoft, said that LLMs currently possess what can be described as an "outsider's view" of culture. While these models are not entirely ignorant of cultural contexts, their understanding can be superficial.

Indian culture and sensitivities are vastly different from those of the Western world, where most current models are trained. To create effective models for India, it is crucial to incorporate alignment techniques that make models more attuned to Indian cultural nuances.

"I do not think that we can ever have a bias-free system. We can only hope to mitigate the bias as far as possible," Bali added.

Public-Private Partnerships Are Needed

Speaking on how various stakeholders can collaborate on building Bharat GPT, Professor Ganesh Ramakrishnan began by highlighting the role of public-private partnerships in the initiative. The project is supported by the Department of Science and Technology under the NMICPS program, along with several IITs and IIMs.

He feels that India needs more skilled people to foster an open-source culture. He also stressed the importance of algorithmic innovation, particularly in resource-constrained settings: given the limited availability of data across Indian languages, innovative algorithms can play a crucial role in making the most of the data that is available. "A lot more can be done there, and that's where, through this academic-industrial collaboration," said Ramakrishnan.

Measuring the Impact of AI Solutions in India

Addressing the impact of AI solutions and how to measure it, Shalini Kapoor, chief technologist, APJ, AWS, said that impact will be defined largely by Indian citizens, because they are the ones who will use these solutions, and usage comes only when there is a need. One of the primary metrics is the business value derived from AI solutions, which includes immediate benefits as well as long-term sustainability.

Another metric is cost-effectiveness, which covers not only the initial investment but also ongoing operational costs. "People don't have that much time and energy, cost, effort to waste," said Kapoor. A successful AI solution should also integrate multiple components rather than relying solely on LLMs. Mitigating bias and ensuring the ethical use of AI are essential metrics as well.

All the speakers shared their views at the Global India AI Summit 2024.

Paromita Gupta

Entrepreneur Staff

Features Writer with Entrepreneur India

Covering news and trends in the AI and Metaverse segments. An avid book reader running her personal blog on the side. You may reach her at paromita@entrepreneurindia.com.