Mark Zuckerberg Unveils LLaMA, Meta's Powerful New Large Language Model
Large tech companies and startups are racing to develop products with integrated advanced AI.
By Steve Huff
On Friday, Facebook co-founder Mark Zuckerberg announced Meta Platforms' impending release to researchers of a new large language model called LLaMA (Large Language Model Meta AI). The model, developed by Meta's Fundamental AI Research (FAIR) team, is intended to aid scientists and engineers in exploring AI applications and functions such as answering questions and summarizing documents.
The release of LLaMA comes as tech companies race to promote advances in AI techniques and integrate the technology into their commercial products. As CNBC notes, Meta's release differs from competitors' models in that it will be available in a selection of sizes, from 7 billion parameters up to 65 billion parameters. Additionally, Zuckerberg said his company's new LLM technology, which could eventually solve math problems and conduct scientific research, will be available to the research community, and Meta is now accepting applications for access. This contrasts with Google's LaMDA and the models underlying ChatGPT, which are not publicly available.
Reuters points out that Meta is joining an increasingly intense race to dominate AI technology, which began in earnest in late 2022 with OpenAI's ChatGPT. For Meta, LLaMA's launch also represents its commitment to open science, hence the choice to publicly release the state-of-the-art foundational large language model as an open resource to help researchers advance their work. Meta believes that, unlike fine-tuned models designed for specific purposes, its model will prove versatile, with multiple use cases.
Another way LLaMA is different, according to Meta: It requires "far less" computing power than previous offerings and is trained on text in 20 languages, focusing on those that use the Latin and Cyrillic alphabets. Meta says the 13-billion-parameter version of LLaMA outperforms GPT-3, the model family from which ChatGPT's underlying model descends, on most benchmarks. Meta also attributed LLaMA's performance to "cleaner" data and "architectural improvements" in the model that improved training stability.
To maintain the model's integrity and prevent misuse, Meta will release it under a non-commercial license focused on research use cases. Access will be granted on a case-by-case basis to academic researchers; those affiliated with organizations in government, civil society, and academia; and industry research laboratories.
Meta's launch of LLaMA may mark a major development in AI language models. The social media giant's commitment to open science, combined with a non-commercial license that restricts the model to research use, is intended to limit the model's misuse.
LLaMA's versatility and problem-solving ability may offer a glimpse of the substantial benefits AI could bring to billions of people.