Many Companies Are Launching Misleading "Open" AI Models: Here's Why That's Dangerous for Entrepreneurs
Truly open models will level the playing field for AI innovation.
By Raghavan Muthuregunathan Edited by Micah Zimmerman
Key Takeaways
- There is no clear definition of a truly and completely open-source LLM.
- A standardized Model Openness Framework is needed.
- Truly open models accelerate innovation: entrepreneurs can prototype ideas and bring AI products to market faster.
Opinions expressed by Entrepreneur contributors are their own.
The field of AI is advancing rapidly, and large companies continue to launch new foundation models. Yet there is no clear definition of an entirely open AI model. Many models claim to be "open," but their creators release only a subset of components openly and apply restrictive licenses to the rest. This creates a spectrum of partial openness. For example:
- One might publish a model's architecture and weights but not the training data and code.
- One might release the trained weights under a license that prohibits commercial use or restricts derivative works.
- One might release the trained weights under a non-restrictive license but the code under a restrictive one.
This ambiguity around what is truly "open" slows AI adoption and the creation of products and services for end users. It also creates legal risks for entrepreneurs, who may inadvertently violate the terms of partially open models. We need a clear framework for assessing model openness. Such a framework should help AI entrepreneurs, researchers and engineers make informed decisions about which models to use, build derivative works upon and contribute to.
An example
Consider a hypothetical AI startup called "yet-another-chat-bot" that is developing an AI chatbot to improve customer support responses. To accelerate development, the team uses a hypothetical pre-trained language model named "llam-stral." The authors of "llam-stral" have published a paper on arXiv describing its architecture and performance, and they have made the trained weights available for download.
The engineers of "yet-another-chat-bot" build their chatbot prototype on "llam-stral" but later find that its license explicitly prohibits commercial use and the creation of derivative works. In addition, the training data and training code have not been released. The startup is now exposed to legal risks and potential IP infringement claims.
The better outcome would have been for "llam-stral" to adhere to the Model Openness Framework and use standard open licenses, such as Apache 2.0 for the code and CC-BY-4.0 for the weights and dataset. It would then have been clear to "yet-another-chat-bot" that it could use the model commercially and build on top of it.
AI needs a framework that defines the completeness and openness of models to support reproducibility, transparency and usability. Something like the Model Openness Framework published by GenAICommons would help both model creators and consumers understand what the key artifacts are and which of them are open. A completely open model would release all of its components, including training data, code, weights, architecture, technical report and evaluation code, under permissive licenses.
Components of an AI model
By releasing all the artifacts and components associated with a large language model under permissive licenses, creators can claim that their models are genuinely and completely open. This promotes transparency, reproducibility and collaboration in the development and application of large language models.
Some of the essential components are as follows:
- Training Data: The dataset used to train the large language model.
- Data Preprocessing Code: The code used for cleaning, transforming and preparing the training data.
- Model Architecture: The design and structure of the AI model, including its layers, connections and hyperparameters.
- Model Parameters: The learned weights and biases of the trained AI model.
- Training Code: The code used for training the AI model, including the training loop, optimization algorithm and loss functions.
- Evaluation Code: The code used for evaluating the performance of the trained AI model on validation and test datasets.
- Evaluation Data: The dataset used for evaluating the performance of the trained AI model.
- Model Documentation and Technical Report: Detailed documentation of the AI model, including its purpose, architecture, training process and performance metrics, along with the academic paper or technical report that describes the model's methodology, results and contributions to the field.
The more artifacts that are open and permissively licensed, the more open the model.
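To make the checklist concrete, here is a minimal Python sketch that records the license of each component listed above and reports which ones are permissively licensed. It is an illustration only, not the Model Openness Framework's own scoring: the component keys, the `openness_report` helper, the short list of permissive licenses and the license labels given to the hypothetical "llam-stral" release are all assumptions made for this example.

```python
from typing import Optional

# Assumption: a small, non-exhaustive set of permissive license identifiers.
PERMISSIVE_LICENSES = {"Apache-2.0", "MIT", "CC-BY-4.0"}

# The eight components from the list above (key names are illustrative).
COMPONENTS = [
    "training_data",
    "data_preprocessing_code",
    "model_architecture",
    "model_parameters",
    "training_code",
    "evaluation_code",
    "evaluation_data",
    "documentation_and_technical_report",
]


def openness_report(release: dict[str, Optional[str]]) -> dict:
    """Summarize which components are released under a permissive license.

    `release` maps a component name to its license identifier.
    A missing key or a value of None means the component was not released.
    """
    open_components = [
        c for c in COMPONENTS if release.get(c) in PERMISSIVE_LICENSES
    ]
    not_open = [c for c in COMPONENTS if c not in open_components]
    return {
        "open": open_components,
        "not_open": not_open,
        "fraction_open": len(open_components) / len(COMPONENTS),
        "completely_open": not not_open,
    }


# Hypothetical "llam-stral" release from the example earlier in the article.
llam_stral = {
    # Weights are downloadable, but the license prohibits commercial use
    # (the label below is made up for illustration).
    "model_parameters": "custom-noncommercial",
    # Assumed license for the arXiv paper; the article does not state one.
    "documentation_and_technical_report": "CC-BY-4.0",
}

report = openness_report(llam_stral)
print(report["fraction_open"])    # 0.125
print(report["completely_open"])  # False
print(report["not_open"])
```

A startup could run a check like this before building on a model, so that a restrictive weights license or missing training code shows up at the prototype stage rather than after launch.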
Truly open models accelerate innovation
Access to genuinely open AI models levels the playing field for AI entrepreneurs and helps unleash innovation. Entrepreneurs can leverage state-of-the-art models and datasets instead of building every component from scratch. This helps them prototype ideas and validate performance faster, shortening time to market.
Instead of spending time and resources reinventing the wheel and recreating baseline capabilities, AI entrepreneurs can focus on domain-specific challenges and identify ways of adding value. The open licenses used by models conforming to the Model Openness Framework (MOF) also provide confidence that entrepreneurs can legally use the models in commercial products and services.
This reduces worries about IP infringement claims or sudden changes to licensing terms. Access to the entire training data and code under non-restrictive licenses also helps entrepreneurs audit the model's provenance and check compliance with regulations.
Furthermore, engineers can examine the datasets for potential biases. With access to the entire codebase, developers can find bottlenecks and improve performance, which also makes it easier to port the model to different environments and maintain it over time. Thus, entirely open models lower the barriers to building AI-powered products and services and move the needle on innovation.