THEWHITEBOX
The Anatomy of the AI Mind

Welcome back! This Leaders segment goes back to the roots of this newsletter and tackles the lately less-discussed part of AI… the AI itself. We are exploring a new paradigm for AI models, one that may render the Transformer (ChatGPT’s architecture, which has dominated the entire industry for almost a decade) obsolete.

This time, the goal is to conquer, once and for all, continual learning: the ability of an AI to keep learning forever, and the thing all AI labs are working on exhaustively.

And this proposal isn’t coming from an unknown research team; it comes from Google itself (creators of the Transformer), and they’ve announced the research with all the bells and whistles, which is not typical behavior at all.

And with Google’s Gemini 3.0 just around the corner, and with some testers arguing that Gemini 3.0 is the biggest leap in performance since GPT-4 (the biggest real update in more than two years), this feels like too much of a coincidence; we might be at the cusp of the first true paradigm shift in years.

If AI can truly be a winner-takes-all market, continual learning is as close as it gets to being the ‘it’ factor.

Take today’s post as less about investing, bubbles, and debt, and more about what this newsletter was always about: understanding AI like no one around you, without unnecessary jargon or esoteric wording.

Let’s dive in.

This is a long read that requires concentration. So, please leave it for a time when you’re not too mentally drained. It’s not dull, but it will make you think.


Anterograde Amnesia, ChatGPT’s Natural State

Current models “suffer” from a condition that, in humans, we know as ‘anterograde amnesia’.

This is a memory impairment in which the ability to form new long-term memories after the onset of the condition is disrupted. In contrast, older memories from before the event are largely preserved.

This is actually a spot-on way of understanding how models behave. They can learn new things, but the moment we train them on new material, they easily forget what they learned before, an “illness” known as catastrophic forgetting, which makes such updates destructive and thus condemns the AI to this amnesia-like state.
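
To see this in miniature, here’s a toy sketch in Python (scikit-learn, entirely made-up data; real forgetting in trillion-knob networks is subtler, but the flavor is the same): a small model learns task A, then trains on task B, and its grip on task A collapses.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Two incompatible "materials": task A and task B (all data made up).
rng = np.random.default_rng(0)
X_a = rng.normal(size=(200, 3))
y_a = X_a @ np.array([1.0, 2.0, 3.0])    # task A's true pattern
X_b = rng.normal(size=(200, 3))
y_b = X_b @ np.array([-3.0, 0.5, 1.0])   # task B's conflicting pattern

model = SGDRegressor(learning_rate="constant", eta0=0.01, random_state=0)
for _ in range(50):
    model.partial_fit(X_a, y_a)          # learn task A
print(((model.predict(X_a) - y_a) ** 2).mean())  # small: A is learned

for _ in range(50):
    model.partial_fit(X_b, y_b)          # now train on new material (task B)
print(((model.predict(X_a) - y_a) ** 2).mean())  # large: A is forgotten
```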

But to understand why this might soon be a thing of the past, we must first answer: What is a model?

What it means to ‘train’ AI

An AI model is nothing but a ‘map’ between an input and an output. It takes an input, and it gives you back an output. More formally, it’s a learning algorithm that “studies” the data and finds the patterns that allow it to accurately predict an output derived from that data.

Suppose you are Zillow, the largest US real estate marketplace, and you want to offer customers a model that predicts the price of a home. You could theoretically train a model on plenty of home data (number of rooms, postal code, size, bath count, etc.) from examples where the price is known, so that it finds “which of those variables determine the price.”

Once that model is trained, you can use it to predict the price of homes without a price tag, producing a price attuned to homes of similar characteristics. In a nutshell, that's what all AIs do; we just change the task objective.
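
As a minimal sketch of that idea (with made-up homes and prices, and a simple linear model standing in for Zillow’s far more sophisticated system):

```python
from sklearn.linear_model import LinearRegression

# Examples where we know the price (all numbers made up).
# Features per home: [rooms, size in sqft, bathrooms]
homes  = [[3, 1500, 2], [4, 2200, 3], [2, 900, 1], [5, 3000, 4]]
prices = [350_000, 520_000, 210_000, 700_000]

model = LinearRegression()
model.fit(homes, prices)           # "training": find which variables matter

new_home = [[3, 1600, 2]]          # a home without a price tag
print(model.predict(new_home))     # a price attuned to similar homes
```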

Historically, AI researchers have performed statistical analyses to identify the features (the predictors) that best explain the variance of the dependent variable (e.g., what characteristics of a home best predict its actual price).

But with neural networks, which most AI models today are, we don’t have to do this. Instead, the model itself will find the “features that matter” autonomously.

But how does the model actually do that? Through a process called gradient backpropagation.

Think of ChatGPT as a huge panel with thousands of knobs (in reality, trillions of them at this point; we know Grok 4 has three trillion parameters). This panel receives a sequence of text and must produce the next word in the sequence.

The key is that the configuration of these knobs influences what word is determined to be the most likely next word.

Therefore, training means tuning these knobs so that the loss is, on average, small.

In other words, training an AI model (specifically, a neural network) is finding the precise knob configuration so that, on average, the prediction loss is minimal (i.e., the model’s predictions are very similar to the ground truths).

The model correctly predicts ‘Paris’. We like this knob configuration

Luckily, AI models are huge mathematical functions in which each ‘knob’ feeds into other knobs while receiving the output of previous ones (knobs are distributed in layers). From the error measurement we get, we can therefore mathematically compute the positive or negative effect each knob had on the prediction: starting from the last layer’s knobs, we “look back” and see which knobs in the previous layers contributed most to the mistake, and we correct them, much like a DJ nudging the speed knob when the song feels too fast.

Knowing this, we can “turn a knob the other way” if its effect is negative, or reinforce a knob’s configuration if it’s doing its job and helping reduce the error.

The model predicted ‘Madrid’, so we trace back to see what knobs are at fault.
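
Here’s that “turn the knob the other way” logic as a hypothetical one-knob sketch; real models apply the same mechanics across trillions of knobs and many layers:

```python
# One "knob" (w), one example: input x, ground truth y (made-up numbers).
x, y = 2.0, 10.0
w = 0.5                             # initial knob setting

for step in range(50):
    pred = w * x                    # the model's prediction
    error = pred - y                # how wrong it was, and in which direction
    grad = 2 * error * x            # the knob's effect on the squared error
    w -= 0.05 * grad                # turn the knob against its contribution

print(w * x)                        # ~10.0: the knob now produces the answer
```

Backpropagation is simply the bookkeeping that computes this grad value efficiently for every knob in every layer at once.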

Thus, training involves processing a large dataset for an extended period until you eventually achieve a knob panel configuration that is satisfactory. However, we must introduce another important distinction here: not all knobs are equal.

Short-term knobs and long-term knobs

In a model such as ChatGPT, we have two knob types: attention and MLPs. We don’t have to dive into the details today; we just need to focus on the essentials of each.

  1. The attention knobs are responsible for capturing contextual information in the input sequence. These are the knobs that capture to what noun an adjective refers, for instance.

  2. The MLP knobs allow the model to store information that may be important for future predictions.

Thus, attention serves as a ‘working memory’ of sorts, capturing short-term patterns, and the MLPs serve as the long-term memory, capturing patterns across potentially millions of examples. Hold this thought for later because it’s vital.

For instance, if you ask ChatGPT, “What sport does Luka Dončić play?”, the attention knobs will capture that ‘Luka’ refers to Dončić, not to Modrić (a soccer star), and the MLP knobs will use attention’s understanding that the sequence is about a player named “Luka Dončić” (because ‘sport’ appears in the sequence) to find that the answer is “basketball,” even though basketball is never mentioned in the sequence.
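
For the curious, here’s a stripped-down sketch of the attention computation with random stand-in numbers (real models learn these vectors and use hundreds of dimensions per word):

```python
import numpy as np

np.random.seed(0)
words = ["What", "sport", "does", "Luka", "Dončić", "play", "?"]
dim = 4                               # toy size; real models use hundreds
Q = np.random.randn(len(words), dim)  # queries: what each word looks for
K = np.random.randn(len(words), dim)  # keys: what each word offers
V = np.random.randn(len(words), dim)  # values: the information itself

scores = Q @ K.T / np.sqrt(dim)       # relevance of every word to every other
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # softmax
context = weights @ V                 # each word, enriched with its context
# In a trained model, the row for 'Luka' would now carry information
# gathered from 'sport' and 'Dončić', ready for the MLP knobs to use.
```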

This is how ChatGPT works, but how do these models behave?

The training phase is the knob tuning we have just described, and ChatGPT is a very intuitive example. We provide it with a sequence, and the model must predict (or generate, in the case of generative models) each next word in that exact sequence. For every prediction, the model considers all the words in its vocabulary and chooses the one it deems most likely.

We then measure the probability the model assigned to the correct word and assess how far it was from 100% (e.g., the correct word was ‘duck’ and the model gave ‘duck’ a 20% chance of being the next word, rather than 100%). That gap is the loss.

What we are doing here is comparing the model to a perfect distribution (i.e., measuring how far the model is from a ‘perfect model’ that would assign a 100% chance to the correct next word at every step).
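
In numbers, this comparison is the standard cross-entropy loss. For the ‘duck’ example above:

```python
import math

p_duck = 0.20               # probability the model gave the correct word
loss = -math.log(p_duck)    # cross-entropy loss, ~1.61
perfect = -math.log(1.0)    # a "perfect model" (100% on 'duck') scores 0.0
print(loss, perfect)
```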

Why compare against this perfect model? Because, by doing so, our model gets closer and closer to the “target” distribution.

It never quite matches the target distribution, though, because we don’t want to overfit the model to the data: a situation in which the model becomes so fixated on this particular dataset that it is useless on new data (e.g., a model sees too many black cats and jumps to the wrong conclusion that all cats are black, making it incapable of identifying non-black cats).
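
Here’s overfitting in miniature, with made-up points: a curve with enough knobs to memorize six training examples nails them perfectly, yet falls apart on a point it hasn’t seen, while a simpler model generalizes fine.

```python
import numpy as np

rng = np.random.default_rng(1)
x_train = np.arange(6.0)                    # six training points
y_train = x_train + rng.normal(0, 0.3, 6)   # the true pattern is roughly y = x

memorizer = np.polyfit(x_train, y_train, 5) # enough knobs to memorize all six
line      = np.polyfit(x_train, y_train, 1) # a simple model: just a line

x_new = 6.0                                 # an unseen point; the truth is ~6
print(np.polyval(memorizer, x_new))         # typically wildly off
print(np.polyval(line, x_new))              # close to 6
```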

However, and this is one of the key things to take away today, you may assume that the only way models “learn” is by tuning the knobs (by training the model in the literal sense).

Well, this is wrong.

Sequence models adapt to learn, too

I want to reinforce this idea of models as memory hierarchies, because it’s crucial to understand where Google is taking us. A few moments ago, I mentioned that AIs learn by adjusting the knobs, which we refer to as training.

However, models like ChatGPT have an additional advantage, stemming from the attention mechanism (the working memory we just described) that captures the contextual information within the sequence. This weapon is adaptability, and it bears the name of in-context learning (ICL).

In other words, models like ChatGPT can also learn by adapting themselves to the prompt.

You may have noticed in your interactions with ChatGPT or Gemini that if you provide detailed information about yourself in the prompt you give the model, it suddenly seems to know things about you.

This is not because OpenAI is training the model under the hood to serve you on the fly; it’s an emergent property of large language models (LLMs).

This property, ICL, occurs when the model can effectively use the new context provided in the prompt, even though it had never seen that information until that point.

Using our knob analogy, this means the attention knobs collaborate to adapt based on the prompt!

Importantly, knob configurations remain intact (we aren’t modifying the underlying model), but they can combine in previously unseen ways to adapt to your request.
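
As an illustration, here’s a hypothetical prompt built around a made-up rule that no model could have memorized during training:

```python
# No knob changes here: the rule exists only inside the prompt.
prompt = """Translate into our made-up language:
house -> houselu
river -> riverlu
planet ->"""
# A capable LLM completes this with "planetlu", inferring the invented
# rule from just two in-context examples, with no retraining involved.
```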

Think of this in human terms: you approach a woman in a formal meeting and are inclined to greet her with two kisses (very, very southern European behavior, even in formal encounters, but extremely uncommon in the north), but she offers you her hand, so you immediately correct course and shake it.

Will you remember this particular event for the ages and, as a southern European, start offering handshakes from then on?

Most likely not: this kind of event is uncommon in your country, so your future instinct will still be to give two kisses (i.e., you would need many more episodes like that one to reshape the way you greet women in formal meetings in Spain).

However, your brain still adapted and corrected the behavior, which introduces us to three vital takeaways for today:

  1. Human brains adapt without actually performing knob tuning (i.e., without modifying long-term behaviors, aka without training the model on the interaction)

  2. Modifying long-term behaviors requires repetition of events, a pattern that is stretched across many repeated episodes.

  3. Real-world patterns have very distinct effective frequencies; some patterns are immediately noticeable, others take years.

I know what you’re thinking. What is this writer on about with the third takeaway? Stay put, because this will shape your overall view of AI.

We are setting AIs up for failure

As you may infer from our explanation of how ChatGPT operates internally, these knobs are not chosen or distributed arbitrarily.

We significantly influence what models learn by shaping them in specific ways. And the truth is that, to this day, we are absolutely terrible at it, almost primitive.

But that changes today.
