Gemini, O1, Grok, Claude, Llama, Yi, and Mistral: The number of large language models (LLMs) seems to have grown exponentially since OpenAI’s ChatGPT burst into the public consciousness in late 2022. It’s estimated that $154 billion was spent on AI by businesses in 2023, while the most recent McKinsey global survey on AI reveals that up to 65% of firms were regularly using generative AI (gen AI) in their operations. But beyond the hype and billion-dollar figures, it’s still a little unclear where and how the technology is really being used.
Our research aimed to go beyond surveys of business managers and included real-world data on how AI is being adopted. To achieve this, we went to the source to analyze the activities of software developers and how they were interacting with some of the leading open-weight LLMs. These are freely available LLMs that can be accessed, modified, and redistributed with minimal restrictions.
We found significant variations in the adoption of AI based on geography, company size, and industry. Perhaps unsurprisingly, startups and technology firms were the largest users, with U.S. companies way ahead when it comes to embracing the potential of AI. Educational institutions were another key locus of AI activity, with a particularly high adoption of Llama in this segment. What also emerged was the domination of U.S. models, with Mark Zuckerberg’s Llama and Elon Musk’s Grok way ahead of other open-weight LLMs in market share.
Open access data
Our search for concrete usage data led us to the popular developer platform GitHub, an important repository for AI code and other resources. This includes a number of notable open-weight LLMs such as Grok, Meta’s Llama, Yi by the Chinese group 01.AI, and Mistral by its eponymous French AI startup.
The beauty of GitHub is the ability to identify which developers have copied (or “forked”) which LLM code to power AI applications. It therefore offers insights into which LLMs are proving popular among developers and, because developers are the ones pushing technological boundaries, perhaps into broader AI trends as well.
Equally important, the open nature of GitHub also allowed us to identify a decent percentage of those developers’ country of residence, employer sector, and employer size. While not an exact science, our data provide a snapshot of how developers from different countries, company sizes, and industries are adopting LLMs. We benchmarked AI adoption numbers against the use of TensorFlow, one of the most widely used machine learning developer tools, released by Google back in 2015.
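To make the approach concrete, here is a minimal sketch of how fork records might be turned into the percentage shares reported in the tables. The record layout, the function name share_by, and the sample rows are all hypothetical placeholders chosen for illustration; they are not the study's data or its actual pipeline.

```python
from collections import Counter

# Hypothetical fork records: (model, developer's employer sector).
# These rows are illustrative placeholders, not data from the study.
forks = [
    ("llama", "technology"),
    ("llama", "education"),
    ("grok", "technology"),
    ("mistral", "technology"),
    ("yi", "education"),
    ("llama", "other"),
]


def share_by(records, index):
    """Return each category's percentage share of all records.

    `index` selects which field of the record to group on
    (0 = model, 1 = sector in this hypothetical layout).
    """
    counts = Counter(record[index] for record in records)
    total = sum(counts.values())
    return {key: 100 * n / total for key, n in counts.items()}


sector_share = share_by(forks, 1)  # share of forks per employer sector
model_share = share_by(forks, 0)   # share of forks per LLM
```

The same grouping could be applied to forks of a benchmark repository such as TensorFlow, so that the two sets of shares can be compared side by side.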
Industry and company differences in adoption
With AI adoption at a relatively early stage, we find significant variation across different sectors. As you can see from Table 1, technology firms lead the way (48.3% of forks), but the education industry also accounts for a relatively high percentage of adoptions of LLMs (26.3%).
The strong showing of educational institutions reflects the extent of research around LLMs and gen AI. At INSEAD, we certainly see deep interest among our faculty and students to understand LLMs’ capabilities and limitations. Given the extraordinary costs involved in developing modern LLMs, researchers benefit from the availability of these open-weight models rather than relying on the proprietary offerings from the likes of OpenAI and Anthropic.
We found a much lower uptake of LLMs in traditional sectors, especially those producing and selling physical goods—a reminder of the important role of tech and higher education in driving innovation in the economy.
We expected that smaller startups would be at the vanguard of open-weight AI adoption. After all, they are traditionally more agile and therefore much quicker to adopt new technology than larger firms, especially if it might give them a significant market advantage. Although startups do lead the way, there’s considerable activity across all size categories. This contrasts with the TensorFlow benchmark, where large companies are the biggest users by far.
Regional differences
Not surprisingly, North America was the dominant location for developer LLM activity on GitHub, with just over 50% of the forks originating from there. Nonetheless, North America’s dominance in LLMs is less pronounced than in TensorFlow, where it accounts for more than 60% of the forks. In Table 2, we highlight those geographical differences by breaking down our data by region.
While the pattern of adoption by company size was mostly consistent across regions, there were notable differences. Startups have the leading share of forks in every region except North America, where the largest companies (those with more than 10,000 employees) hold the leading share. This could well reflect the greater maturity of AI adoption in the U.S. and Canada, where firms have had longer to get to know and understand the benefits this technology can bring to an organization.
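The regional comparison above amounts to finding, for each region, the company-size category with the most forks. A minimal sketch of that step follows; the region names, size labels, and rows are hypothetical and only illustrate the calculation, not the study's figures.

```python
from collections import Counter, defaultdict

# Hypothetical (region, company size) pairs, one per fork.
# Illustrative placeholders only, not data from the study.
forks_by_region = [
    ("North America", "10,000+"),
    ("North America", "10,000+"),
    ("North America", "startup"),
    ("Europe", "startup"),
    ("Europe", "startup"),
    ("Europe", "10,000+"),
]


def leading_size_by_region(records):
    """Return the company-size category with the most forks in each region."""
    counts = defaultdict(Counter)
    for region, size in records:
        counts[region][size] += 1
    # most_common(1) yields a one-element list of (category, count)
    return {region: c.most_common(1)[0][0] for region, c in counts.items()}


leaders = leading_size_by_region(forks_by_region)
```

With real data, ties between categories would need an explicit rule, since this sketch simply keeps whichever tied category was counted first.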
The domination of U.S. LLMs is perhaps more striking. Llama has an outsized share across all the regions, with Grok firmly in second spot. Our data don’t seem to suggest that developers express any regional loyalties toward local LLMs, such as Mistral in Europe.
Musk vs. Zuckerberg
Llama is clearly benefiting from Meta’s decision to embrace an open-weight strategy for its LLMs early on, starting with the release of the weights for Llama 2 in July 2023. Llama’s leadership position also benefits from its backing by the Meta brand. As Merouane Debbah (distinguished AI researcher and leader of the team that developed the Falcon LLM in Abu Dhabi) puts it, “Developers need to have confidence in the staying power of an open model for them to build their applications on top of it.”
With his Grok models, Musk is looking to challenge Meta for leadership in the open-weight segment while also taking on former collaborator Sam Altman, the CEO of OpenAI. Since the release of Grok 2 in August 2024, Musk’s xAI has adopted a hybrid strategy where its latest model remains private, though the company publishes the weights to its prior models.
It’s interesting to note Grok’s strong performance despite being late to the game, especially in China and the rest of the world. Its close association with X may help, as might the star appeal of Musk. It will be interesting to see whether his new role with the U.S. government will affect adoption levels in the future.
Beyond region, our data (see Table 3) suggest other differences in the user base of these two models. Grok has a slightly heavier weight in the tech sector, while Llama seems more popular among those in the education sector. Finally, Llama is slightly more popular among large organizations, while Grok is more associated with startups.
Musk, who described Grok as a “maximum truth-seeking AI,” has previously commented on his desire for tough regulations around AI. Meanwhile, Zuckerberg has made numerous statements about the need for open-source AI to become the industry standard. It will be interesting to see the effect of their LLMs going head to head on the future of the open-weight approach to gen AI.
Looking ahead
We’re still in the early days of developing and adopting LLMs. Expect to be surprised. Divining the future is highly challenging, given the intense competition among the top players, not to mention the effect of geopolitical pressures. Will a more assertive Trump administration affect the ability of the U.S. to dominate gen AI?
Our analysis, rooted in actions by real-life developers, provides insights into the current state of play. Importantly, it gives us a benchmark for future analysis and the opportunity to spot trends in developers’ preferences for specific LLMs.
We’re particularly interested in following the adoption dynamics behind open-weight models. Will they be able to compete effectively with fully proprietary models? Might Meta or xAI shift away from their commitment to open weights? Currently, open-weight models face constraints ranging from data access to cost, but they seem to have the greatest opportunities for significant innovation.
Co-founder of Near.ai and co-author of the landmark paper “Attention Is All You Need,” Illia Polosukhin says, “The future of AI should be open and accessible to everyone. As developers continue to push the boundaries of what’s possible with this technology, permissionless, accessible models will be the foundation upon which new breakthroughs are built.”
The picture will undoubtedly change and develop over time. We hope that this will enable us to further map adoption trends and provide better indicators of where we might be headed on this revolutionary AI journey.
Published Dec. 19, 2024, by INSEAD.