Elon Musk’s xAI has announced its first multimodal model, Grok-1.5 Vision, also known as Grok-1.5V. The announcement follows the company’s unveiling last month of Grok-1 AI, its challenger to ChatGPT.
Grok-1.5V not only understands text but can also process images. It can interpret a wide range of visual information, including documents, images, screenshots, charts, and diagrams. In a recent blog post describing Grok-1.5 Vision’s capabilities, the company mentioned:
Grok-1.5 Vision outperforms its rivals in the RealWorldQA benchmark
The company also demonstrated Grok-1.5 Vision’s capabilities with seven sample use cases:
- Writing code from a diagram
- Calculating calories
- From a drawing to a bedtime story
- Explaining a meme
- Converting a table to CSV
- Help with rotten wood on a deck
- Solving a coding problem
The Musk-led AI company also shared a chart comparing its first multimodal model with its rivals. The test results show that Grok-1.5 Vision holds its own against competitors such as GPT-4 with Vision, Claude 3 Sonnet/Opus, and Gemini Pro 1.5.
The results look promising across the board, and xAI’s Grok-1.5V outperforms all of these competitors on the RealWorldQA benchmark. According to the company, RealWorldQA is a new benchmark designed to evaluate the basic real-world spatial understanding capabilities of multimodal models.
It is clear that Musk’s AI company has no intention of taking a back seat and is moving aggressively to keep up with its rivals. That said, its AI models have received a fair amount of criticism in the past. Recently, Grok AI was criticized for misinformation, among other things.
Lastly, Grok-1.5V will soon be available to existing Grok users and early testers.
Grok-1.5V follows xAI’s recent introduction of Grok-1.5, which features enhanced reasoning capabilities and a context length of 128,000 tokens. Grok-1.5 shows notable improvements, particularly in coding and math-related tasks. It beats Mistral Large on various benchmarks, including MMLU, GSM8K, and HumanEval.