xAI Previews Grok-1.5V, Its First Multimodal Model

Elon Musk’s xAI has announced its first multimodal model, Grok-1.5 Vision, also known as Grok-1.5V. The announcement follows the company’s release last month of Grok-1 AI to take on ChatGPT.

Grok-1.5V not only understands text but can also process images. It can interpret visual information in documents, images, screenshots, charts, and diagrams. In a recent blog post on Grok-1.5 Vision’s capabilities, the company mentioned:

Grok-1.5 Vision outperforms its rivals in the RealWorldQA benchmark

The company also demonstrated Grok-1.5 Vision’s capabilities with seven sample use cases:

Writing code from a diagram
Calculating calories
From a drawing to a bedtime story
Explaining a meme
Converting a table to CSV
Help with rotten wood on a deck
Solving a coding problem

The Musk-led AI company also shared a chart comparing its first multimodal model with its rivals. The test results show that Grok-1.5 Vision is competitive with models such as GPT-4 with Vision, Claude 3 Sonnet/Opus, and Gemini Pro 1.5.

The results look promising overall, and Grok-1.5V outscores all of these competitors on the RealWorldQA benchmark. According to the company, RealWorldQA is a new benchmark designed to evaluate the basic real-world spatial understanding capabilities of multimodal models.

It is clear that Musk’s AI company is in no mood to take a backseat and is moving aggressively to keep up with its rivals. However, its AI models have received a fair amount of criticism in the past. Recently, Grok AI was criticized for spreading misinformation, among other issues.

Lastly, Grok-1.5V will soon be available to existing Grok users and early testers, so if you are among the early testers, you should be able to try Grok-1.5 Vision soon.

Grok-1.5V follows xAI’s recent introduction of Grok-1.5, which features enhanced reasoning capabilities and a context length of 128,000 tokens. Grok-1.5 boasts notable improvements, particularly in coding and math-related tasks. It beats Mistral Large on various benchmarks, including MMLU, GSM8K, and HumanEval.
