Zenith


    Monday, April 15, 2024

    xAI Previews Grok-1.5V, Its First Multimodal Model


    Elon Musk’s xAI has announced its first multimodal model, Grok-1.5 Vision, also known as Grok-1.5V. The preview follows last month’s announcement of Grok-1, the company’s AI model meant to take on ChatGPT.

    Grok-1.5V not only understands text but can also process images. According to xAI, it can interpret documents, photographs, screenshots, charts, and diagrams. The company described these capabilities in a recent blog post.

    Grok-1.5 Vision outperforms its rivals in the RealWorldQA benchmark

    The company also demonstrated Grok-1.5 Vision’s capabilities with seven sample tasks:

    • Writing code from a diagram
    • Calculating calories
    • From a drawing to a bedtime story
    • Explaining a meme
    • Converting a table to CSV
    • Help with rotten wood on a deck
    • Solving a coding problem

    The Musk-led AI company also shared a chart comparing its first multimodal model with its rivals. According to the reported results, Grok-1.5 Vision is competitive with models such as GPT-4 with Vision, Claude 3 Sonnet, Claude 3 Opus, and Gemini Pro 1.5.


    The results look promising across the board, and Grok-1.5V outscores all of the listed competitors on the RealWorldQA benchmark. According to the company, RealWorldQA is a new benchmark designed to evaluate the basic real-world spatial understanding capabilities of multimodal models.

    It is pretty clear that Musk’s AI company is in no mood to take a backseat and is moving aggressively to keep up with its rivals. However, its AI models have received a fair amount of criticism in the past. Recently, Grok was criticized for spreading misinformation, among other issues.

    Lastly, Grok-1.5V will soon be available to early testers and existing Grok users.

    Grok-1.5V follows the recent introduction of Grok-1.5 by xAI, featuring enhanced reasoning capabilities and a context length of 128,000 tokens. Grok-1.5 boasts notable improvements, particularly in coding and math-related tasks. It beats Mistral Large on various benchmarks including MMLU, GSM8K and HumanEval.
