The Future OS: Confirmed
I posted earlier this week about ChatGPT's new multimodal capabilities and how they're a glimpse into the future of how we interact with computing devices.
The next day, the All-In Podcast talked about much the same thing.
I think they, via the illustrations below, bear this belief out even further. (They also mention that inputs can, or will, include more options than just images and voice. Code Interpreter lets you input documents and code snippets, so that makes sense.)
Humans have five major sensory inputs we use to feed our brain computers. Up to this point, we've had one input type to interact with computers: text.
Multimodal is a way of mapping our multi-sensory experience onto interactions with computers, making them more natural for us.
We've adapted ourselves to the limits of computing. Now we might be able to adapt our computers to ourselves.
I don't love the examples OpenAI used in their demos. I see hints of what they promise, but they lack oomph. This example, though, perfectly shows how you can use multimodal in your day-to-day to make life a bit easier:
To drive this point home even further, here's one of the guys who'd be on the Mount Rushmore of modern AI development sharing his take:
TLDR looking at LLMs as chatbots is the same as looking at early computers as calculators. We're seeing an emergence of a whole new computing paradigm, and it is very early.
-Andrej Karpathy
And Google's Bard can now interface with a host of Google apps and services, coming close to a true smart, personalized assistant.
How do we extend our brand experiences into this new environment?
What does "brand" mean in this future?
(Basically, take all the questions people had about brands and the metaverse after the Grand Meta Rebranding™ and replace "metaverse" with "multimodal AI chat".)