OpenAI's Roadmap, Bringing Pixar's Lamp to Life, & More



THEWHITEBOX
TLDR;
Today, we analyze OpenAI’s announced roadmap, their “discovery” that proves DeepSeek was right, the strong yet concerning state of Supermicro, the dangers of flirting AIs, the first on-device LLM app, and Apple’s first robotics endeavor: bringing Pixar’s lamp back to life, which inspires our trend of the week: ‘emotion-displaying robots.’

FRONTIER RESEARCH
OpenAI’s Roadmap
In a viral tweet, Sam Altman explained OpenAI’s roadmap for the upcoming weeks and months. First, he announced that Orion, the secretive model thought to be the key behind o3, is now called GPT-4.5 and will be released next. He also stated that it would be their ‘last non-reasoning-model release.’
But what does he mean by that? Well, they’ve decided that the next release after GPT-4.5, GPT-5, will consolidate all models into a single offering; there will be no more model dropdowns.
In other words, he announced GPT-5, a system of models, including o3. In summary, they want to create a single point of contact that can handle various tasks, dynamically and autonomously switching between models depending on the type of the user’s request.
Some considerations. What Sam means by GPT-4.5 (Orion) not being a reasoning model is that it’s a pre-trained model trained to compress trillions of words (and probably images, video, audio…) of knowledge into its weights. It's a new update of GPT-4, the base model underpinning all ChatGPT models for the last two years (probably even including o3-mini, although we know they post-trained it with Orion data).
But what is the difference between a non-reasoning GPT-4.5 and a reasoning model like o3?
TheWhiteBox’s takeaway:
The funny thing is that, behind closed doors, non-reasoning and reasoning models aren’t that different, as the base of a reasoning model is a non-reasoning model. However, there’s a stark difference in how they approach problem-solving: one takes its time, following a multi-step approach, while the other responds automatically, without hesitation.
I always find the System 1 / System 2 framework from Daniel Kahneman’s Thinking, Fast and Slow, consistently referenced by these AI labs, a great way to differentiate the two: non-reasoning models are akin to human intuition (System 1 thinking); they are fast and automatic, committing to a particular response with no “second thoughts.” Reasoning models, on the other hand, are slow and deliberate; they take their time to answer and can reflect, backtrack, or search for better solutions in real time, akin to humans engaging their prefrontal cortex to solve a complex task (System 2 thinking).
Of course, choosing one or the other depends on the task. If the user asks for a creative poem, this is instinctive for someone trained in the art of poetry (as GPT-4.5 will be); thus, thinking fast is the way to go. Likewise, for questions that simply require knowledge, the model either knows the answer or it doesn’t, so there is no need to think about it for longer.
But if the user asks it to solve a complex maths problem, while the fast thinker could get the answer right in one go, it most likely benefits from thinking a little longer on the task, allowing itself to correct wrong assumptions, iterate, expand, and converge on a solution it deems appropriate. As Noam Brown, Reasoning Lead at OpenAI, puts it, “some tasks benefit from thinking for longer on them.”
Therefore, GPT-5 is a system that encompasses all models and tools and dynamically routes the user’s request to the most appropriate model using a router (this router will be similar to the services offered by companies like Martian, but internal and automatic for ChatGPT users). OpenAI is trying to become the backend of intelligence and knowledge, a single point of interaction where the human asks, and the best model responds.
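To make the routing idea concrete, below is a minimal sketch of what such a dispatcher could look like. The model names and the keyword-based classifier are illustrative assumptions, not OpenAI’s actual implementation, which will almost certainly rely on a learned router rather than hand-written rules.

```python
# Illustrative sketch of a "GPT-5-style" model router (assumptions, not OpenAI's design).

def classify_request(prompt: str) -> str:
    """Naive stand-in for the router's classifier: decide whether a request
    needs slow, deliberate reasoning or a fast, intuitive answer."""
    reasoning_markers = ("prove", "debug", "step by step", "optimize", "solve")
    if any(marker in prompt.lower() for marker in reasoning_markers):
        return "reasoning"
    return "fast"

def route(prompt: str) -> str:
    """Dispatch the prompt to the model class best suited to it."""
    model = {
        "reasoning": "o3",      # slow, deliberate, multi-step (System 2)
        "fast": "gpt-4.5",      # quick, intuitive, single pass (System 1)
    }[classify_request(prompt)]
    return f"[{model}] would handle: {prompt!r}"

print(route("Write a short poem about autumn."))
print(route("Solve this optimization problem step by step."))
```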
It seems DeepSeek did shake things up a little bit, huh? Since the Chinese release, OpenAI has gone into full shipping mode, releasing o3-mini (low and high thinking) and, just now, announcing file support for o1 and o3-mini while also raising o3-mini usage limits for the Plus tier ($20/month).
Do we need any more proof that no one benefits more than customers when open source is allowed to thrive? As showcased here, it actively forces AI companies to compete fiercely (Google is also shipping considerably faster and has acknowledged sharing the same vision of AI systems as OpenAI).
HARDWARE
Strong SuperMicro Earnings, But…
Supermicro has announced an anticipated net sales figure of $5.6 to $5.7 billion for the second quarter of fiscal year 2025, marking a 54% year-over-year increase.
This growth is primarily driven by heightened demand for AI-related platforms, which now constitute over 70% of the company’s revenue across both enterprise and cloud service provider markets. The company attributes this surge to its offerings in air-cooled and direct liquid cooling (DLC) rack-scale AI GPU platforms.
GPUs heat up a lot, so they need ever more cooling power, to the point that most Blackwell chips demand liquid cooling instead of fans.
However, despite these positive developments, Supermicro has adjusted its fiscal year 2025 revenue forecast to a range of $23.5 to $25 billion, down from the previously projected $26 to $30 billion. This revision is primarily due to supply constraints related to Nvidia’s Blackwell GPUs, which have delayed product shipments. CEO Charles Liang noted that while the company is prepared to ship with liquid cooling solutions, delays in receiving Blackwell products have significantly impacted their revenue projections.
Looking ahead, Supermicro projects revenues of $40 billion for fiscal year 2026, which exceeds analysts’ expectations of $30 billion. The company plans to expand its manufacturing capacity, particularly in the U.S., to meet the increasing demand for its products and services.
TheWhiteBox’s takeaway:
This is all well and good for the main cooling supplier for NVIDIA chips; it’s no surprise they are thriving. However, the long shadow cast upon them last year due to fears of ‘dubious’ auditing practices remains a problem, as they have yet to deliver the annual report. Supermicro aims to file its delayed annual report with the SEC by February 25, 2025.
Therefore, to me, the biggest takeaway here is NVIDIA. What is going on with Blackwell?
The new GPU platform was supposed to start deliveries in Q4 2024, but it has yet to make a single shipment, to the point that many hyperscalers switched back to H100 orders in the meantime. The issue is that these problems will only get worse over time, as GPUs force NVIDIA into ever more cumbersome compute/memory architectures. Every day the shipment delay endures is great news for challengers who aren’t constrained by the limits of the GPU.
FRONTIER RESEARCH
o3 Wins Gold in the ‘DeepSeek way’

Reasoning models that achieve high Elo (a measure of performance) in competitive programming benchmarks like Codeforces, or that win a gold medal in programming olympiads, as o1 already did, aren’t new.
But OpenAI’s latest paper shows how vital the DeepSeek lesson, originally the Bitter Lesson by Rich Sutton, is. They found that o3 blew past o1’s results (i.e., achieved gold-medal status) without all the clever heuristics OpenAI had to apply to o1 to get there (careful filtering, teaching key reasoning priors, etc.). Instead, o3 figured all of this out by itself through more extended reinforcement learning at scale.
Does that sound familiar? Exactly, it’s the DeepSeek recipe! In layman’s terms, the lesson is always the same: the key to progress in artificial intelligence is more compute and fewer human biases and hacks. DeepSeek took R1 to the o1 level without teaching it to reflect on its responses, backtrack, explore other alternatives, and so on. Instead, the model figured them all out by itself simply by ‘guessing and verifying.’
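To give a feel for what ‘guessing and verifying’ means in practice, here is a toy sketch of reinforcement learning against a programmatic verifier. The arithmetic task, the random ‘policy,’ and the reward values are stand-ins for illustration; this is not DeepSeek’s or OpenAI’s actual training setup.

```python
import random

# Toy "guess and verify" loop: the model proposes answers, a verifier scores them,
# and only that reward signal (no hand-crafted reasoning heuristics) shapes learning.

def propose_answer(problem: tuple[int, int]) -> int:
    """Stand-in policy: guess an answer to a + b (random before any training)."""
    return random.randint(0, 20)

def verify(problem: tuple[int, int], answer: int) -> float:
    """Programmatic verifier: reward 1.0 only for a provably correct answer."""
    a, b = problem
    return 1.0 if answer == a + b else 0.0

dataset = [(3, 4), (5, 9), (2, 2)]
rollouts = []
for problem in dataset:
    for _ in range(8):  # sample several attempts per problem
        answer = propose_answer(problem)
        rollouts.append((problem, answer, verify(problem, answer)))

# At scale, high-reward rollouts would now update the policy (e.g., via PPO/GRPO);
# behaviors like reflection and backtracking emerge from this signal alone.
print(f"{sum(r for *_, r in rollouts):.0f} correct out of {len(rollouts)} attempts")
```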
And now, it seems OpenAI arrived at the same conclusion when training o3.
TheWhiteBox’s takeaway:
We can’t overstate how important the DeepSeek release has been. OpenAI’s paper shows that they had reached a similar conclusion when training o3 but decided to hide it from the rest of the world. Only after DeepSeek came out did they suddenly say, ‘Yes, of course, we already knew that!’
This is yet another proof that open source has to remain viable and supported by the community, unless we want these breakthroughs to remain behind closed doors.
And make no mistake: OpenAI’s price drops are due to open source, not its love for humanity. Don’t forget that.
LOVE
A Flirting AI
A young developer has created a flirting AI that ‘moves five steps ahead in a conversation’ to dictate what you should say based on your goal (in this case, flirting). In the video, the developer shows how the AI recommends exactly what to say at every step based on the user’s goal.
TheWhiteBox’s takeaway:
While I appreciate the kid’s developer capabilities, I hate this with a passion.
Our younger generations already struggle to avoid being excessively awkward in social interactions. Most have been raised indoors, in households full of tablets and social media reels, with little real human connection.
In fact, in surveys, job interviewers have claimed that some kids brought their parents to interviews and couldn’t even hold eye contact.
And now, AI offers yet another excuse for them not to get their shit together and instead let the AI do the talking, even when getting to know what could be their future significant other.
There’s certainly a case to be made that AI could push humanity backward in several regards, especially in social interactions.
But this isn’t only a social problem. Microsoft recently published a study claiming that AI use can leave human cognition “atrophied and unprepared.” Another study, by an MIT professor, showed how a cohort of humans using ChatGPT severely underperformed one using just Google Search, suggesting that AI can actually decrease our problem-solving capacity.
Will we become dumber, more boring, and less remarkable as AI becomes smarter, more interesting, and more remarkable?
ON-DEVICE
LLM App with On-Device Models
Allen Institute for AI (Ai2) has launched OLMoE, an open-source iOS app enabling users to run advanced language models directly ‘on-device’ for privacy and security. The app supports iPhone 15 Pro or newer and M-series iPads, allowing researchers and developers to test AI models locally without an internet connection.
OLMoE is fully open-source, allowing integration into other apps or AI research. More details and source code are available on Ai2’s blog.
Built on Ai2’s OLMoE model, it offers 35% better performance while using Q4_K_M quantization for efficiency. Developed with Llama.cpp, it achieves 41 tokens/sec on iPhone 16 Pro.
But what does all this tech jargon even mean?
TheWhiteBox’s takeaway:
For starters, the biggest takeaway is that this app does not require an Internet connection because it uses Llama.cpp to run the model directly on your device. In other words, the language model, which is just a digital file, is stored on your device’s disk and loaded into RAM whenever you want to use it.
This makes the deployment privacy-preserving: your data is never compromised because it never leaves your device. There are too many use cases for on-device AI to list them all, but one example is a company executive using the model to write emails without fearing that the confidential data shared with the model will ever leave the safe confines of their device.
They then apply 4-bit quantization (the Q4_K_M format), which means the weights are stored at roughly half a byte each instead of the typical two bytes (FP16) or one byte (INT8). This cuts the model’s memory footprint roughly by four compared to FP16, ideal for RAM-constrained environments like smartphones. In simple terms, we store each parameter with less precision, losing a little accuracy but requiring far less storage.
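For the curious, here is a minimal sketch of the same mechanics on a laptop using the llama-cpp-python bindings. The GGUF file path is hypothetical, and Ai2’s iOS app embeds llama.cpp natively rather than through Python, but the load-from-disk, run-in-RAM, no-network flow is the same.

```python
# Sketch of fully on-device inference with llama.cpp's Python bindings.
# The model path is a hypothetical local file; nothing here touches the network.
from llama_cpp import Llama

llm = Llama(
    model_path="models/olmoe-q4_k_m.gguf",  # 4-bit quantized weights on local disk
    n_ctx=2048,                              # modest context window to fit in RAM
)

# The prompt and the generated text never leave the device.
output = llm(
    "Draft a short, confidential status update for the board:",
    max_tokens=128,
)
print(output["choices"][0]["text"])
```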

TREND OF THE WEEK
Has Apple Shown Us a Different Future for Robotics?

Apple has brought Pixar’s famous intro lamp to life, presenting a new piece of research where the adjectives used to describe it aren’t the ones you would usually use in AI research.
It’s not technically smarter, more powerful, or nightmare fuel for those afraid of AI.
Instead, it appears to be Big Tech’s first attempt at improving AI and robotics intelligence in a different direction. Rather than making robots functionally smarter, Apple is focusing on emotion. Their newest technology, and quite possibly their first robotic product, will be an emotion-displaying lamp.
Yes, you read that right.
But before you dismiss the entire thing, bear with me. This is a beautiful piece of research that offers a refreshing perspective on how AI robots should be designed and gives crucial insight into Apple’s future plans.
Additionally, it will give you intuition as to why, ironically, robots will never have real emotions.
Let’s dive in.
From Luxo to Pixar to ELEGNT
Before you read further, please take two minutes—literally—to watch this 1986 Pixar animation. What’s wonderful about it is that both lamps (Luxo and Luxo Jr) can express so much emotion without uttering a single word, just through a series of movements that mimic how humans move and gesture when displaying emotion.
Well, that’s precisely what Apple is proposing, that we shouldn’t optimize robots solely to be functional. Instead, to make them truly appealing to humans, we should also train them to display emotion.
Look at the short video below, where Apple’s lamp inspects an object. If you look at the lamp's movements, most are functionally irrelevant to answering the question. However, those movements also help you guess what the lamp is doing (investigating the objects) and whether it has understood the assignment.
In other words, it’s using non-verbal cues to transmit more information about its actions.

The ELEGNT lamp can also perform a wide range of other tasks, such as following the user’s book, making creative projections, or playing with the user. It can also talk and project images onto books and walls.

Not convinced as to why we should do this? Look at the following video, a comparison between an expressive robot lamp and a purely functional one.

As you can see, the expressive robot performs several functionally irrelevant but highly emotionally expressive motions along the way.
First, it looks back at the user to signal it is listening.
Then, it looks at the goal to convey that it has understood the assignment.
Next, it stretches several times, signaling that it’s trying to reach the goal but can’t.
It finally shakes its head as a sign of defeat.
Which robot transmits its thoughts and actions better? Of course, some of you will find this completely unnecessary and just want the robot to do what it has been told. (Don’t worry; the expressiveness can be adapted, as we’ll see later.) Still, I find the expressive robot much easier to interpret, while the functional one, which also failed, is much harder to debug (to understand why it failed the task).
This is a simple example, but I hope it helps convey that creating an emotional lamp isn’t about bringing Pixar’s lamp to life ‘just because’; it’s about building robots that humans are more comfortable interacting with. Furthermore, considering they are AIs, their movements might also offer greater transparency into their internal processes, which remain pretty much black boxes.
Great reception, but adaptability is welcome.
The reception of this new robot lamp wasn’t wholly positive, though. The study found that emotionally expressive robots were generally well received, especially in social tasks, where users rated them significantly higher in engagement, intelligence, and willingness to interact than purely functional robots. However, some users found excessive expressiveness inefficient, preferring straightforward actions in task-oriented scenarios, a signal that the degree of expressiveness should be customizable.
For instance:
Younger users and non-roboticists preferred expressive robots more,
while older users and those with artistic backgrounds had higher expectations.
Overall, expressive robots enhance human-robot interaction, but they need to balance emotion with efficiency, especially in practical applications, so users should be allowed to manage the expressiveness of robots.
But how? And how do we train for such a thing?
Optimizing… Not to be Too Optimized
The best way to explain how ELEGNT was trained is by looking at the image below:

When striving to understand how an AI was trained, you must always understand the training goal.
In this case, the robot isn’t tasked with fulfilling its goal via the best and most optimized path but rather in a more emotionally driven way that suggests, via movements, what it has perceived and will do while also achieving the goal.
While most AIs aim to take the fewest actions possible to achieve the desired outcome (the most functional path), ELEGNT involves “additional steps” that enrich the entire experience. In other words, the robot isn’t trained only for functional purposes; it is also trying to express emotion through its movements.
Mathematically, this translates to the equation below, in which the quality of the robot’s actions, its maximum utility, isn’t measured in functional terms but as a weighted sum of functional and emotional actions.
The ‘τ’ (tau) symbol represents the trajectory the robot takes, and ‘γ’ (gamma) is a balancing hyperparameter that controls how much weight the emotional actions carry; the smaller this adaptable parameter is, the more the robot optimizes purely for functionality, and vice versa. Therefore, reaching maximum utility isn’t just about being maximally functional; the robot also has to maximize its display of emotion.
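To make that concrete, here is a rough sketch of the weighted-sum selection rule described above. The candidate trajectories and their scores are made up, and the exact formulation in Apple’s paper may differ; the point is simply how gamma trades functionality against expressiveness.

```python
# Sketch of the weighted-sum utility: pick the trajectory tau maximizing
# functional utility plus gamma times expressive utility. Scores are invented.

def utility(functional: float, expressive: float, gamma: float) -> float:
    return functional + gamma * expressive

candidates = {
    "shortest_path":     {"functional": 1.00, "expressive": 0.10},
    "glance_then_reach": {"functional": 0.85, "expressive": 0.80},
    "full_pantomime":    {"functional": 0.60, "expressive": 1.00},
}

for gamma in (0.0, 0.5, 1.0):  # gamma = 0 -> purely functional robot
    best = max(candidates, key=lambda t: utility(**candidates[t], gamma=gamma))
    print(f"gamma={gamma}: choose {best}")
```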

For those among my readers who are still unimpressed, the gamma parameter would let you clamp down the expressiveness if you aren’t willing to deal with a lamp that seems to understand emotions.
That said, if being maximally optimized for function were the best option, why aren’t we humans that way? Shouldn’t we just display zero emotion and always do the most functionally correct thing? If humans evolved to display emotion as part of being ‘intelligence optimized,’ I definitely think AIs should too.
Regarding the training procedure, there are no surprises here.
Like most robots today, they treat the lamp as a Markov Decision Process (MDP), where each action is based on the current state, not prior ones. MDPs are much more computationally viable; otherwise, the robot would have to keep a history of all previous actions and states, which in lengthy tasks can cause a combinatorial explosion that makes it too expensive to run.
But is ELEGNT deep learning, aka a neural network?
Apple does not clarify, but given the relatively small number of actions and states the robot can be in, it was probably trained via Q-learning, a method where the robot learns a table of state-action rewards (action ‘x’ performed in state ‘y’ yields reward ‘z’) that it simply looks up in real time to pick the next action (unlike deep learning, this is a tabular, lookup-table approach, so no neural networks here).
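Since Apple doesn’t say how the policy is represented, take the following as a toy illustration of tabular Q-learning rather than ELEGNT’s actual training code. The states, actions, and reward function are invented, and for brevity the sketch uses one-step rewards with no discounting; the point is the lookup table the robot consults at run time.

```python
import random
from collections import defaultdict

# Toy tabular Q-learning: learn a table Q[(state, action)] -> expected reward,
# then act by looking up the best action for the current state (no neural net).
states = ["idle", "user_speaking", "goal_visible", "goal_unreachable"]
actions = ["stay", "look_at_user", "stretch_toward_goal", "shake_head"]

def reward(state: str, action: str) -> float:
    """Hand-written stand-in for the functional + expressive reward signal."""
    good = {("user_speaking", "look_at_user"): 1.0,
            ("goal_visible", "stretch_toward_goal"): 1.0,
            ("goal_unreachable", "shake_head"): 1.0}
    return good.get((state, action), 0.0)

Q = defaultdict(float)
alpha = 0.5                                  # learning rate
for _ in range(2000):                        # random exploration episodes
    s, a = random.choice(states), random.choice(actions)
    Q[(s, a)] += alpha * (reward(s, a) - Q[(s, a)])

# At inference, the robot simply looks up the highest-value action for its state.
for s in states:
    best = max(actions, key=lambda a: Q[(s, a)])
    print(f"{s:16s} -> {best}")
```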
And what does this tell us about Apple’s future… and ours?
Robotics with a Twist
While most Big Tech companies are software-based, Apple has the advantage of having unmatched hardware distribution; its products and brand are present in almost every known household. However, Apple undeniably lacks the AI software expertise of Microsoft or Google, so it needs to get creative.
For some time now, it has been clear that Apple is increasingly interested in everything related to smart homes. In addition to the HomePod, Apple is reportedly releasing a smart home tablet very soon, and according to well-known Apple analysts like Ming-Chi Kuo, they are considering releasing both humanoid and non-humanoid robots, like this lamp, as soon as 2028.
In other words, Apple is seriously considering becoming a robotics company.
Therefore, if home software and robotics are indeed on Apple’s mind, this lamp makes sense for them. Think about it: it could just be another excuse to have their AIs in your home at all times, always present and ready to help (and, of course, sell you stuff in the process).
If your home becomes Apple-driven, Apple becomes the door to solving most day-to-day concerns: groceries, intelligent use of lighting to save a few bucks, putting on your favorite podcast or artist… all roads lead to them becoming omnipresent in your life, something both Amazon and Google are desperately trying to achieve. For the most visionary among you, you can even think of this lamp as Apple’s next developer platform, where new apps that users pay to use are created, generating new revenue sources for Apple.
Unsurprisingly, all Big Tech companies with any hardware presence are trying to invade your home, as whoever owns your smart home devices owns your door to the Internet.
Are Humans Prepared for this, Though?
As a final reflection, this research is an excellent way to visualize humans’ desperation to anthropomorphize things. It’s almost like we want the lamp to be alive, to understand us, to feel us. And I can guarantee you that a few minutes of interacting with it would make you forget it’s actually just imitating patterns without the slightest feeling of emotion underneath.
What I like about this research is that it makes it strikingly clear that emotional actions performed by robots are no different from functional ones.
It has simply learned that, to humans, nodding means ‘okay’ (in most cultures) and that putting its translucent head down is a sign of defeat. But that doesn’t mean the lamp is sad or feels defeated; it’s literally imitating human actions to convey human emotions because experience and reward training taught it to.
All things considered, while I wholeheartedly welcome ‘humanized’ robots into our lives, I can’t pretend not to be worried about the people who will confuse this with the lamp actually having emotions. It doesn’t and never will, although I feel it’s only a matter of time before someone falls in love with one of these things.
AI won’t kill us, but it could make some people fall for it. But guess what? I would buy this lamp.
Would you?

THEWHITEBOX
Join Premium Today!
If you like this content, join Premium to receive four times as much content weekly without saturating your inbox. You will even be able to ask the questions you need answers to.
Until next time!
For business inquiries, reach out to me at [email protected]