The Surprising Evolution of AI Interaction

This Week I Discovered the Unseen Dynamics of AI Evolution

Good morning. This week, I am writing you from the Midwest. I am in Normal, IL, where our Rivian plant is located. Every time I drive up to the plant and see its sheer size and massive scale of operations, I think about how one guy had this idea to start a car company and then pulled it off.

It takes an insane amount of conviction and patience to spin up an almost 4-million-square-foot production facility that costs billions, negotiate hundreds of supplier contracts, build out a delivery network, hire thousands of people, and fully equip the factory with tools and production lines. And we have already made over 100,000 vehicles here. There is nothing easy about that, but it’s a hell of a journey.

Let’s get back to the AI business. This week, it’s time again to discuss recent AI developments with you. I had some interesting conversations about the unseen dynamics of AI evolution.

You might say, "But Sebastian, I am here for the strategy analysis." And that’s fair. We’ve done a lot of these.

But hear me out. Depending on the product you are working on, strategy and planning involve a good deal of market observation, product discovery, and consumer behavior analysis. So, how cool would it be if I sprinkled in some of those once in a while? Totally free for you as always.

It’s a 5-minute-read as usual.

What will ChatGPT-17 look like?

180 million users are eagerly awaiting the next wave of AI, the next big feature update, GPT-5.

But what we don’t think about enough is that in 5-10 years, all the advancements we get so excited about today will feel like rudimentary demo versions.

GPT-17 will be in an entirely new dimension, but nobody can explain how…

Compare this to the first iPhone introduced to the world, without a single app from the App Store, and to what it is today and what it did to the world…

Who could have foreseen how the iPhone changed the world? How will AI impact the world? We don’t know yet, but we can take guesses.

I like to explore this worthy thought experiment once in a while.

What’s known to be next for ChatGPT?

We have some upcoming enhancements that indicate the near-term direction.

Near-term enhancements

  • Advancing Multimodality: Real-time data processing will enable tasks like language translation or AR/VR interaction. Understanding different data types, such as audio, video, and text, gives the AI much more context about interactions.

  • Enhanced Contextual Understanding: It will have a longer-lasting memory and better understand its users and what is relevant to them.

  • Bias and Regulation: As legislation catches up, more data protection will be introduced, and more efforts will be made to reduce bias.

But all of the above is known and expected, plus larger models, more efficiency, a better understanding of emotions, and better industry-specific models.

We’ve also seen many companies implement, experiment with, and launch AI integrations. These integrations are often in the form of assistants, like Priceline’s “Penny”, an AI bot that helps people book vacations.

What does the travel AI “Penny” do?

  • Finding and booking the perfect hotel based on your needs.

  • Making phone calls to hotels to make special requests.

  • Managing your whole trip package: renting a car, booking activities, and adding them to your calendar.

It’s all coming. But let’s talk about the more unexpected, juicy stuff.

Emergent and Unpredictable Capabilities

Bot-to-bot Interactions are Emerging

Now imagine this: “Penny,” the AI travel agent, is calling a hotel to make a reservation on your behalf. Guess who will pick up the phone for the hotel in the future?

It will be “Leslie,” the bot built for the hotel chain to take hotel reservations.

Loman.ai has already rolled out bots to hundreds of restaurants across the US. In fact, in their logs, they detected an interaction between a restaurant bot and an incoming call from a Google bot.

The bots conversed about making a reservation, the kids’ menu, and the availability of high chairs. The interaction worked well.
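To make the idea concrete, here is a toy sketch of what such a bot-to-bot exchange might look like if it were modeled as structured messages rather than spoken audio. This is not Loman.ai’s actual system; all class names, intents, and fields are invented for illustration.

```python
# Hypothetical sketch of a bot-to-bot reservation "call" as a
# structured message exchange. Names and message formats are made up.

from dataclasses import dataclass, field


@dataclass
class RestaurantBot:
    """Answers the 'phone' for the restaurant."""
    high_chairs_available: bool = True
    bookings: list = field(default_factory=list)

    def handle(self, msg: dict) -> dict:
        if msg["intent"] == "ask_high_chair":
            return {"intent": "answer", "high_chair": self.high_chairs_available}
        if msg["intent"] == "book":
            self.bookings.append(msg)
            return {"intent": "confirm", "party": msg["party"], "time": msg["time"]}
        return {"intent": "unknown"}


class CallerBot:
    """Calls on behalf of a family that needs a high chair."""

    def make_reservation(self, restaurant: RestaurantBot) -> dict:
        # Step 1: check the special request before booking.
        answer = restaurant.handle({"intent": "ask_high_chair"})
        if not answer.get("high_chair"):
            return {"intent": "abort", "reason": "no high chair"}
        # Step 2: book the table.
        return restaurant.handle({"intent": "book", "party": 4, "time": "18:30"})


result = CallerBot().make_reservation(RestaurantBot())
print(result)  # → {'intent': 'confirm', 'party': 4, 'time': '18:30'}
```

The interesting part is that neither bot needs to know it is talking to a machine: the caller simply asks its questions and reacts to the answers, exactly as it would with a human on the line.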

Restaurant chains like Panera are experimenting with AI ordering in drive-throughs, and more fast-food restaurants are also experimenting with kitchen robots. So, the path to an end-to-end automated restaurant is clear.

Source: WSJ

There we have it: bot-to-bot interactions are starting. But how will they continue? That doesn’t seem clear to anyone.

  • Will bots develop their own communication patterns? Maybe they can be more efficient than we are…

  • Will bots always operate in our best interest?

  • Will bots go beyond their task to solve additional problems that arise?

Are we merely going to be actors on the sidelines, getting in cars that pick us up and take us wherever Penny and Leslie think we’d want to go and can afford? And to continue this vision…of course, they will have checked in with “Bobby” before.

Bobby is the bot that picks up the phone at our bank. We authorize Bobby to inform our other personal bots about our financial situation. He knows what we typically spend on a vacation or a night out.

Do we want to organize a party? Penny, Leslie, and Bobby will figure out where, what, and when. They will contact all of your friends' bots to make sure they RSVP and bring “no gifts, please.” The only shame is that Penny, Leslie, and Bobby will be our closest companions, but they are the ones who can’t come to the party.

The point I am getting to is that we know banks, for example, are working on financial-advisor bots, but that is only a small stepping stone.

If the bot tells me I have some short-term cash I should put into T-bills, then it should just do it.

If the bot tells me that I need to renegotiate my car insurance, it should just do it.

I could think of a million examples where bots that advise me to do things should do them for me instead. Whether I trust the bot to do the right thing at this current stage is a different discussion.

AI is Capable of Training Itself

Since GPT-3, users have observed that AI can surface emergent behaviors, meaning there are things the AI knows without anyone having trained it on them. This is called “zero-shot” or “few-shot” capability: the AI’s ability to solve problems it has never (or rarely) seen before.
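The terms describe how the prompt is built, not any extra training of the model. A minimal sketch, assuming a made-up task-and-answer prompt format, might look like this:

```python
# Hypothetical sketch of "zero-shot" vs. "few-shot" prompting.
# The prompt template is invented for illustration; the point is that
# the only difference is whether worked examples appear in the prompt.

def zero_shot_prompt(task: str) -> str:
    # No examples: the model must solve the task from pre-training alone.
    return f"Task: {task}\nAnswer:"

def few_shot_prompt(task: str, examples: list[tuple[str, str]]) -> str:
    # A handful of worked examples are placed directly in the prompt.
    shots = "\n".join(f"Task: {t}\nAnswer: {a}" for t, a in examples)
    return f"{shots}\nTask: {task}\nAnswer:"

print(zero_shot_prompt("What movie do these emojis describe? 👧🐟🐠🐡"))
print(few_shot_prompt(
    "What movie do these emojis describe? 👧🐟🐠🐡",
    [("What movie do these emojis describe? 🦁👑", "The Lion King")],
))
```

In both cases the model’s weights are untouched; the surprise researchers keep running into is how much capability the zero-shot version already has.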

The first hints of these “zero-shot” abilities came in an experiment in which the AI was asked this question:

What movie do these emojis describe?

👧🐟🐠🐡

The answers were found to depend on the model's size (complexity).

The simplest model would reply, "The movie is a movie about a man who is a man who is a man." This is not very convincing.

A medium-sized (medium-complexity) model was closer with the guess “The Emoji Movie”.

But the shocking result came from the most complex model (similar to the LLM behind ChatGPT): it nailed the expected answer, “Finding Nemo.”

LLMs were developed and trained to expect and make sense of text strings, not a bunch of emojis. Even the researchers were surprised to see these “emergent” abilities.

So researchers have now started creating lists to keep track of these emergent capabilities.

It is somewhat eerie to know that, aside from cataloging these unpredictable capabilities, researchers still don’t understand where and how these extra skills come from.

This also means a lot is going on, and a lot of evolution will happen that we absolutely can’t foresee.

The layer of abstraction we are missing

Just like AI, I evolved while writing this post. This, by the way, is exactly why writing this newsletter makes me so happy. I learn, and I think more. I hope I can create some “learn-happiness” with you as well.

I was going to conclude that the user experience needs to evolve. I wanted to write about how users should expand their expectations, learn new modes of interaction, and collaborate with AI even more.

But then it dawned on me.

Users should do the exact opposite. Everyone is talking about learning prompt engineering and how to optimize your AI interactions to get the most out of them.

I now strongly believe the opposite.

AI will soon have a much simpler user experience, because instead of learning how to interact with different AI systems, the user will delegate all tasks that lead to a specific goal to ONE AI. This AI knows how to get the best results from other AIs. Why would the user need special knowledge to solve a task via AI?

  • AI as the Mediator: Your personalized “AI friend” knows how to meet your goals efficiently.

  • AI Chains: Your personalized “AI friend” connects with all the AIs needed to achieve the outcome you are looking for. The user initiates the process, and the AIs coordinate amongst themselves.

  • The simplest AI interface: An AI interface should ask for the user's end goal and then get it done. Initial versions provide an audit trail and maybe an approval chain from users. However, the user can opt to auto-approve once they gain trust.
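The three ideas above can be sketched together as a tiny program: one mediator accepts a goal, routes subtasks to specialized bots, keeps an audit trail, and only acts when approval is granted. The bot names reuse Penny, Leslie, and Bobby from earlier, but the routing logic and interfaces are entirely invented assumptions, not any real product’s design.

```python
# Minimal sketch of the "AI as mediator" pattern: one personal AI
# delegates subtasks to specialist bots, with an audit trail and an
# opt-in auto-approve switch. All names and interfaces are hypothetical.

class SpecialistBot:
    def __init__(self, name: str):
        self.name = name

    def run(self, subtask: str) -> str:
        return f"{self.name} handled '{subtask}'"


class MediatorAI:
    def __init__(self, auto_approve: bool = False):
        self.auto_approve = auto_approve
        self.audit_trail: list[str] = []
        # Routing table: which specialist handles which kind of subtask.
        self.specialists = {
            "finance": SpecialistBot("Bobby"),
            "venue": SpecialistBot("Leslie"),
            "travel": SpecialistBot("Penny"),
        }

    def achieve(self, goal: str, plan: list[tuple[str, str]]) -> list[str]:
        """plan: (kind, subtask) pairs derived from the user's goal."""
        results = []
        for kind, subtask in plan:
            if not self.auto_approve:
                # Initial versions pause here for explicit user approval.
                self.audit_trail.append(f"awaiting approval: {subtask}")
                continue
            result = self.specialists[kind].run(subtask)
            self.audit_trail.append(result)
            results.append(result)
        return results


mediator = MediatorAI(auto_approve=True)
out = mediator.achieve("organize a party", [
    ("finance", "check budget for a night out"),
    ("venue", "reserve a private room"),
    ("travel", "arrange rides for guests"),
])
```

The user only states the goal; everything below that line, including which bot talks to which, is the mediator’s problem, and the audit trail is what the user reviews until they trust it enough to flip auto-approve on.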

These thoughts open up the conversation about how we could shift our focus from direct interaction with specialized AI towards managing AI relationships.

We don’t know what’s going to happen with AI, but whatever vision we make up, we are likely underestimating it.

I hope this opens your mind as much as it did mine. I’d love to hear your thoughts on it.

Have a great rest of the week,


Disclaimer: The views, thoughts, and opinions expressed in the text belong solely to the author.
