Meta releases Llama 3.2 – and gives its AI a voice

Powering the new Meta AI capabilities is an enhanced version of Llama, Meta’s flagship large language model. The free model announced today could also have a broad impact, given how widely the Llama family has already been adopted by developers and startups.

Unlike OpenAI’s models, Llama can be downloaded and run locally without charge – although there are some limitations on large-scale commercial use. Llama can also be more easily fine-tuned, or modified with additional training, for specific tasks.

Patrick Wendell, co-founder and vice president of engineering at Databricks, a company that hosts AI models including Llama, says many companies are attracted to open models because these models let them better protect their own data.

Large language models are increasingly becoming “multimodal”, meaning they are trained to handle audio and images as input as well as text. This expands a model’s capabilities and allows developers to build new types of AI applications on top of it, including so-called AI agents capable of performing useful tasks on a computer on a user’s behalf. Llama 3.2 should make it easier for developers to create AI agents that can, say, surf the web, perhaps looking for deals on a certain type of product, when given a brief description.

“Multimodal models are a big deal because the data that people and businesses use is not just text, it can be in many different formats, including images and audio, or more specialized formats like protein sequences or financial records,” says Phillip Isola, a professor at MIT. “Over the past few years, we’ve moved from strong language models to models that work well on images and voices as well.”

“With Llama 3.1, Meta has shown that open models can finally close the gap with their proprietary counterparts,” says Nathan Benaich, founder and general partner of Air Street Capital and author of an influential annual report on AI. Benaich adds that multimodal models tend to outperform larger text-only models. “I’m excited to see how 3.2 takes shape,” he says.

Earlier today, the Allen Institute for AI (Ai2), a research institute in Seattle, released an advanced open-source multimodal model called Molmo. Molmo was released under a less restrictive license than Llama, and Ai2 is also releasing details of its training data, which can help researchers and developers experiment with and modify the model.

Meta said today that it will release Llama 3.2 in several sizes with correspondingly different capabilities. In addition to two more powerful versions with 11 billion and 90 billion parameters (a measure of a model’s complexity as well as its size), Meta is releasing less capable versions with 1 billion and 3 billion parameters, designed to work well on portable devices. Meta says these versions are optimized for ARM-based mobile chips from Qualcomm and MediaTek.

Meta’s AI overhaul comes at a tumultuous time when tech giants are racing to offer the most advanced AI. The company’s decision to release its most valuable models for free could give it an edge in providing the foundation for many AI tools and services – especially as companies begin to explore the potential of AI agents.
