Moshi Chat's GPT-4o advanced voice competitor tried to argue with me — OpenAI doesn't need to worry just yet (2024)

Moshi Chat is a new native speech AI model from French startup Kyutai, promising a similar experience to GPT-4o where it understands your tone of voice and can be interrupted.

Unlike GPT-4o, Moshi is a smaller model and can be installed locally and run offline. This could be perfect for the future of smart home appliances — if they can improve the responsiveness.

I had several conversations with Moshi. Each can last up to five minutes in the current online demo, and every one ended with the model repeating the same word over and over, losing coherence.

In one of the conversations it started to argue with me, flat out refusing to tell me a story and insisting instead on stating a fact. It wouldn’t let up until I said “tell me a fact.”

This is all likely an issue of context window size and compute resources, which should be solvable over time. While OpenAI doesn’t need to worry about competition from Moshi yet, it does show that others are catching up, just as Luma Labs, Runway and others are pressing against Sora on quality.

What is Moshi Chat?

Moshi Chat is the brainchild of the Kyutai research lab and was built from scratch in six months by a team of eight researchers. The goal is to make it open and to build on the model over time. For now, this is the first openly accessible native generative voice AI.

“This new type of technology makes it possible for the first time to communicate in a smooth, natural and expressive way with an AI,” the company said in a statement.

Its core functionality is similar to OpenAI’s GPT-4o, but it comes from a much smaller model. It is also available to use today, whereas GPT-4o advanced voice won’t be widely available until the fall.

The team suggests Moshi could be used in roleplay scenarios or even as a coach to spur you on while you train. The plan is to work with the community and make it open so others can build on top of and further fine-tune the AI.

Moshi is built on a 7B-parameter multimodal model called Helium, trained on text and audio codecs, and it handles speech in and speech out natively. It can run on an Nvidia GPU, Apple's Metal or a CPU.
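To put the 7B figure in context, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at a few common precisions. The bytes-per-parameter values are generic assumptions, not numbers published by Kyutai, and the estimate ignores activations, the audio codec and any runtime overhead.

```python
# Rough weight-memory estimate for a 7B-parameter model.
# Assumption: generic storage formats; these are not Kyutai's published figures.
PARAMS = 7_000_000_000  # "7B" as quoted in the article

# Bytes needed to store one parameter in each (assumed) format.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: int, bytes_per_param: float) -> float:
    """Return memory for the weights alone, in gigabytes (10^9 bytes)."""
    return params * bytes_per_param / 1e9


for name, size in BYTES_PER_PARAM.items():
    print(f"{name:>8}: {weight_memory_gb(PARAMS, size):.1f} GB")
```

On those assumptions, a model of this size fits on a single consumer GPU or a recent laptop only once it is stored at 16 bits per parameter or fewer, which is roughly why a small model matters for running locally.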

What happens next with Moshi?

Kyutai hopes that the community support will be used to enhance Moshi's knowledge base and factuality. These have been limited because it is a lightweight base model, but it is hoped that expanding these aspects in combination with native speech will create a powerful assistant.

The next stage is to further refine the model and scale it up to allow for more complex and longer form conversations with Moshi.

In using it and from watching the demos, I’ve found it incredibly fast and responsive for the first minute or so, but the longer the conversation goes on, the more incoherent it becomes. Its lack of knowledge is also obvious, and if you call it out for making a mistake it gets flustered and goes into a loop of "I’m sorry, I’m sorry, I’m sorry."

This isn’t a direct competitor for OpenAI’s GPT-4o advanced voice yet, even though advanced voice isn’t currently available. But offering an open, locally running model that has the potential to work in much the same way is a significant step forward for open source AI development.

Ryan Morrison

AI Editor

Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover.

When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?
