Elon Musk Announces Grok-2 Beta With Powerful Capabilities
Date: August 14, 2024
Elon Musk has announced Grok-2 Beta, a new version of the Grok AI chatbot that outperformed leading chatbots such as Claude 3.5 Sonnet and GPT-4-Turbo.
Elon Musk has made an announcement that will affect the user experience on his X platform. xAI, his AI startup, has released an early version of Grok-2 Beta under the name "sus-column-r". The company put Grok-2 Beta through one of the most competitive language model benchmarks, the LMSYS Chatbot Arena. The results showed that Grok-2 Beta outperformed Claude 3.5 Sonnet and GPT-4-Turbo in overall Elo score.
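For readers unfamiliar with Elo scores, the following is a minimal sketch of the classic Elo update applied to head-to-head votes between two chatbots. It is an illustration only: the model names, starting ratings, votes, and the K-factor of 32 are all hypothetical, and the Chatbot Arena's actual leaderboard computation may differ from this simple online update.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one head-to-head vote.

    k is the update step size; 32 is a common default, assumed here.
    """
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Hypothetical data: two models start level, then receive three votes.
ratings = {"model_x": 1000.0, "model_y": 1000.0}
votes = [
    ("model_x", "model_y", True),   # voter preferred model_x
    ("model_x", "model_y", True),
    ("model_x", "model_y", False),  # voter preferred model_y
]

for a, b, a_won in votes:
    ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], a_won)

print(ratings)
```

A model that wins more often than its current rating predicts gains points, so a higher overall Elo score reflects being preferred more often across many such comparisons.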
The official blog described the method used for Grok-2 Beta: “Our AI Tutors engage with our models across a variety of tasks that reflect real-world interactions with Grok. During each interaction, the AI Tutors are presented with two responses generated by Grok. They select the superior response based on specific criteria outlined in our guidelines.”
The team focused on evaluating two key capabilities: following instructions and providing accurate, factual information. Grok-2 Beta's results were significantly better than those of its predecessor in reasoning, reading comprehension, math, science, and coding. It also achieved scores competitive with top AI chatbots in graduate-level science knowledge (GPQA), general knowledge (MMLU, MMLU-Pro), and math competition problems (MATH). Additionally, Grok-2 excels in vision-based tasks, delivering state-of-the-art performance in visual math reasoning (MathVista) and in document-based question answering (DocVQA).
According to xAI, Grok-2 has also shown significant improvements in reasoning with retrieved content and in its tool use capabilities, such as correctly identifying missing information, reasoning through sequences of events, and discarding irrelevant posts.
Musk has also announced a smaller version of Grok-2, Grok-2 Mini, described as a compact yet capable AI chatbot. Both Grok-2 and Grok-2 Mini will be available to all X Premium and Premium+ subscribers. Grok-2's AI art generator feature could be a game-changer in the AI landscape, potentially improving xAI's global market share.
By Arpit Dubey
Arpit is a dreamer, wanderer, and tech nerd who loves to jot down tech musings and updates. With a knack for crafting compelling narratives, Arpit has a sharp specialization in everything: from Predictive Analytics to Game Development, along with artificial intelligence (AI), Cloud Computing, IoT, and let’s not forget SaaS, healthcare, and more. Arpit crafts content that’s as strategic as it is compelling. With a Logician's mind, he is always chasing sunrises and tech advancements while secretly preparing for the robot uprising.
