Meta AI Releases 2-4X Faster And 56% Sleeker Llama 3.2 Model
Date: October 25, 2024
Meta has taken a significant step toward bringing faster, smaller AI models to broader audiences, with improved compatibility for mobile devices.
Meta AI is upgrading its Llama models to process tasks 2-4X faster while cutting model sizes by up to 56%. The effort responds to rising hardware processing requirements and constrained power supplies, which demand sleeker AI models with equivalent performance. The latest Llama 3.2 release addresses the high energy costs, lengthy training times, and expensive semiconductors that such computational power normally requires.
The new models are built on two distinct techniques: Quantization-Aware Training (QAT) with LoRA adapters, which prioritizes accuracy, and SpinQuant, a state-of-the-art post-training quantization method that prioritizes portability. Both versions are available for download.
Shrinking model sizes while preserving powerful output capabilities will dramatically enhance research and business optimization efforts, putting cutting-edge AI technologies within reach without specialized, costlier infrastructure.
Llama 3.2 has surpassed industry benchmarks for quality and safety while achieving 2-4X faster processing speeds. The new models also achieved an average 56% reduction in size and 41% less memory usage compared to the original BF16 format.
This advancement also improves compatibility with mobile devices, offering better features for mobile users within their hardware constraints. The reduction is powered by quantization, a technique that reduces the precision of the model's weights and activations from the 16-bit BF16 floating-point format to lower-bit representations.
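As a rough illustration of that idea (a minimal sketch, not Meta's actual quantization kernels), symmetric 8-bit quantization maps each floating-point weight to a small integer plus a single shared scale factor:

```python
# Illustrative sketch of symmetric per-tensor 8-bit quantization.
# All names here are hypothetical, not from any Llama codebase.

def quantize_int8(weights):
    """Map floats to int8 codes in [-127, 127] using a shared scale."""
    scale = max(abs(w) for w in weights) / 127.0
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from the int8 codes."""
    return [c * scale for c in codes]

weights = [0.12, -0.5, 0.33, 1.0]
codes, scale = quantize_int8(weights)   # each code fits in one byte
approx = dequantize(codes, scale)       # close to, not equal to, weights
```

Each stored value drops from 16 bits (BF16) to 8, at the cost of a small rounding error per weight; real deployments quantize per channel or per group to keep that error low.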
Meta AI also utilizes 8-bit and 4-bit quantization strategies, which reduce memory consumption and computational demands while retaining critical capabilities of Llama 3.2, such as advanced natural language processing, real-time application integration, and visual inference tasks.
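Much of the memory saving at 4 bits comes from packing two 4-bit codes into each byte. A hedged sketch of that storage trick (function names are illustrative, not from any Llama implementation):

```python
# Illustrative sketch: packing unsigned 4-bit quantization codes
# two-per-byte, halving storage relative to 8-bit codes.

def pack_nibbles(values):
    """Pack 4-bit codes (0..15) into bytes, two codes per byte."""
    assert all(0 <= v <= 15 for v in values)
    if len(values) % 2:
        values = values + [0]  # pad to an even count
    packed = bytearray()
    for lo, hi in zip(values[::2], values[1::2]):
        packed.append(lo | (hi << 4))
    return bytes(packed)

def unpack_nibbles(packed, count):
    """Recover the original 4-bit codes from the packed bytes."""
    out = []
    for b in packed:
        out.append(b & 0x0F)  # low nibble
        out.append(b >> 4)    # high nibble
    return out[:count]

codes = [3, 15, 7, 0, 9]
packed = pack_nibbles(codes)   # 3 bytes instead of 5
```

Production inference stacks pair this packing with per-group scales and fused dequantize-multiply kernels, but the storage arithmetic is the same.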
Meta AI has also partnered with industry leaders to make the sleeker Llama 3.2 models available on Qualcomm and MediaTek Systems on Chips (SoCs) with Arm CPUs. The partnership aims to bring advanced performance to consumer-grade hardware, reaching a broader audience across popular platforms. Llama 3.2 underscores the importance of addressing the scalability issues common to businesses and research organizations while maintaining a high level of performance. Early benchmarks indicate that quantized Llama 3.2 retains approximately 95% of the full model's performance while using 60% less memory. This achievement lends further credibility to AI chatbots by reducing the environmental impact of training and deploying LLMs.
By Arpit Dubey
Arpit is a dreamer, wanderer, and tech nerd who loves to jot down tech musings and updates. With a knack for crafting compelling narratives, Arpit has a sharp specialization in everything from Predictive Analytics to Game Development, along with artificial intelligence (AI), Cloud Computing, IoT, and let's not forget SaaS, healthcare, and more. Arpit crafts content that's as strategic as it is compelling. With a Logician's mind, he is always chasing sunrises and tech advancements while secretly preparing for the robot uprising.
