The Week of the Giants

For business inquiries, reach out to me at [email protected]

THEWHITEBOX
TLDR;

  • 😟 A $200/month ChatGPT?!

  • 📹 Google Starts the AI Video Race

  • 🇨🇳 China Claps Back with Tencent’s Hunyuan

  • ⚡️ Meta Enters the Nuclear Race

  • 🤨 Over Half of LinkedIn Posts are AI-Generated

  • 🇺🇸 Trump Appoints Big-Tech Hawk to the DoJ

  • [ARTICLE OF THE WEEK] Amazon’s Great Week

There’s a reason 400,000 professionals read this daily.

Join The AI Report, trusted by 400,000+ professionals at Google, Microsoft, and OpenAI. Get daily insights, tools, and strategies to master practical AI skills that drive results.

NEWSREEL
A $200 ChatGPT?!

Yes, you read that correctly. OpenAI has just announced a new $200/month tier named ChatGPT Pro, which includes access to the just-released o1 model, now finally available to both Pro and Plus users (the latter being the $20/month subscription most of us were on).

The difference is that the former gives you access to a more powerful o1, in the sense that OpenAI allows it to think for longer. Funnily enough, one of the best things about this release (and one of the points highlighted by OpenAI) is that o1 is much more concise in its thinking, aka less wordy. LLMs are usually unbearably verbose and unnecessarily long in their responses, so this is beyond great.

TheWhiteBox’s takeaway:

It’s too early to jump to conclusions, but boy, do those $200/month subscriptions look expensive. Personally, I’ve yet to find a single reason to use o1-preview or o1-mini on a daily basis; most of the tasks they are presumably helpful for are ones I prefer to do myself (also because I simply don’t trust their outputs).

A 10x in cost should yield a 10x improvement in this so-called ‘intelligence,’ but that seems far-fetched. More likely, the reality is that they are making you pay a huge premium to access the top-tier model while making their business model less unprofitable.

The issue? China is coming strong, and as long as Chinese labs keep closing the gap with closed source models by releasing amazing open-source models, there’s a non-zero chance that OpenAI will never make sense from a business perspective.

NEWSREEL
Google Starts the Video Race

While many start-ups are trying to make a name for themselves in the AI-generated video scene, with examples like Pika Labs, Runway, or Luma, Google is the first of the big players to finally release its video generation model, Veo.

Developed by Google DeepMind, Veo is a video generation model that creates high-definition videos from text or image prompts, offering a variety of cinematic and visual styles. It ensures consistency and coherence, enabling realistic movements of people, animals, and objects within the generated footage (allegedly).

TheWhiteBox’s takeaway:

AI-generated video is undoubtedly the most impressive Generative AI technology. However, to date, it seems more like a product looking for a problem to solve than one that actually responds to real user needs.

The main issue is inconsistency; they can generate short videos, but these are often easily identifiable as ‘AI-generated’ due to a clear lack of world understanding (they do not fully adhere to the physics of the real world). That means they require extensive engineering from those using them: several tries and edits that can quickly increase costs. Consequently, the few people who may manage to use these models profitably will be rich YouTubers or movie production companies, and only for very specific cuts of films/series.

Knowing how hard it is to adopt LLMs, which can hide their inefficiencies much better as they only generate text, I do not foresee a revenue explosion from these models anytime soon.

However, I’m much more optimistic about their future role as world models. Video models can perceive the world “as it is,” ingesting visual, audio, and text data all in one, making them much more serious candidates to be the engines of the AIs of the future than an LLM that only sees the world through text.

VIDEO GENERATION
China Claps Back with Hunyuan

In the same week Google and OpenAI (probably tomorrow) finally released their video-generation models, Tencent, a Chinese corporation, released Hunyuan. This new open-source video-generation model is on par with, if not superior to, the former two.

The model is fairly small (13 billion parameters) but performs well, improving over Runway’s Gen-3 Alpha and Luma AI’s 1.6 model.

TheWhiteBox’s takeaway:

It’s been months since we first called this, but there’s no way around it by now: China is here. Importantly, they are taking the right side in their AI strategy by going full open-source.

The reason the Chinese government is pushing for open source so heavily, which we will discuss in detail on Sunday, is that in little over a year, they’ve closed a multi-year gap with models from the US and Europe simply by letting their researchers build freely… the opposite of what many AI regulations in the West are doing.

ENERGY
Meta Enters the Nuclear Race

Meta is joining the nuclear frenzy driven by—forecasted—AI demand. The company is exploring nuclear energy to power its AI data centers, seeking to establish 1-4 GW of nuclear capacity in the U.S. by the 2030s.

This aligns with a broader trend of tech giants turning to nuclear power for sustainable, reliable energy to meet AI-driven demand. Meta also announced a $10 billion investment in its largest AI data center, located in Louisiana, further proving its dedication to scaling its AI efforts.

TheWhiteBox’s takeaway:

The scale of the numbers is staggering. Reaching 4 GW of AI demand would mean Meta providing AI services that consume more electricity than four San Franciscos combined.

Four gigawatts of sustained power draw would consume roughly 35 TWh of electricity per year, the equivalent of Ireland’s entire electricity consumption in 2021. GPU-wise, that translates into roughly 2.8 million GPUs (H100 equivalents).

These calculations assume the current Thermal Design Power of state-of-the-art GPUs, 700 W (that number isn’t falling, by the way, as NVIDIA’s new Blackwell GPUs require the exact same power as their predecessors), and assume all other equipment (cooling, networking, etc.) doubles that figure to 1,400 W per GPU (SemiAnalysis estimates).

The exact number of Blackwell-equivalent GPUs would have to be calculated differently because Blackwell connects many more GPUs together per rack.
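For the curious, here’s the back-of-the-envelope arithmetic behind those figures as a quick Python sketch (same assumptions as above):

```python
# Back-of-the-envelope check of the figures above.
HOURS_PER_YEAR = 8760  # 24 * 365

power_gw = 4                                   # Meta's upper target
annual_twh = power_gw * HOURS_PER_YEAR / 1000  # GW * hours -> GWh -> TWh
# 4 * 8760 / 1000 = 35.04 TWh, roughly Ireland's 2021 electricity use

gpu_tdp_w = 700           # H100-class thermal design power
all_in_w = 2 * gpu_tdp_w  # ~2x for cooling, networking, etc. (SemiAnalysis)
num_gpus = power_gw * 1e9 / all_in_w
# 4e9 W / 1400 W ≈ 2.86 million H100 equivalents

print(f"{annual_twh:.1f} TWh/year, {num_gpus / 1e6:.2f}M H100 equivalents")
```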

Either way, those are some insane numbers, but I would love to know where they are getting those demand estimates; current adoption isn’t anywhere near those numbers, and there’s little indication today that this will change anytime soon unless LLMs/LRMs become more robust (which they aren’t).

AI-GENERATED CONTENT
Over Half of LinkedIn Posts are AI-Generated

A study by Originality.AI analyzed 8,795 LinkedIn posts exceeding 100 words, revealing that as of October 2024, 54% of these long-form posts are likely AI-generated.

This marks a 189% increase in AI-generated content since the launch of ChatGPT in late 2022. Additionally, the average length of these posts has risen by 107%, indicating that AI tools may enable users to produce longer content more efficiently.

The surge in AI-generated content on LinkedIn aligns with the platform’s integration of AI features. LinkedIn has introduced AI-powered tools to assist users in crafting posts, profiles, and messages, aiming to enhance user engagement and streamline content creation.

TheWhiteBox’s takeaway:

Unsurprising yet terrible news. If you read this newsletter, you know by now that AI-generated content is terrible, average at best. That said, its ease of use is undeniable, leading many “content creators” to choose the easy path to farm engagement.

The Internet should be a place to share novel ideas or things that really mean something to you. Today, it’s mostly a place to generate as much content as possible to earn more money.

The result is a more crowded, low signal-to-noise-ratio (most content is rubbish) and, quite frankly, unbearable experience. This is why I believe so many people are turning to newsletters in search of more meaningful content (that said, most AI newsletters are an exact extrapolation of this very same problem).

I am biased, but I feel paywalled content is one of the best ways to incentivize the writer to generate high-quality, thought-provoking content; free content is free, but in most cases, unamusing (and now, full of ‘underscore,’ ‘delve,’ and other insufferable words LLMs (or their human labelers, actually) can’t get enough of).

GOVERNMENT
Trump Names Big-Tech Hawk as Top Antitrust Official

President-elect Donald Trump has appointed Gail Slater to lead the Department of Justice’s (DOJ) antitrust enforcement, which includes ongoing lawsuits against Google and Apple.

Slater, a former Federal Trade Commission (FTC) lawyer and adviser to Vice President-elect JD Vance, is known for her critical stance on the influence of major tech companies. Previously, she lobbied for tech firms like Microsoft, Google, and Amazon.

Slater’s role will involve addressing concerns over monopolistic practices in the tech sector. Trump highlighted her mandate to rein in tech giants, stating they have stifled competition.

TheWhiteBox’s takeaway:

You can already feel the influence that JD Vance has on President Trump (unlike Vice President Pence). As we covered in our deep dive on the upcoming Trump Administration’s stance on AI, JD Vance is notoriously anti-Big Tech (as are two of the largest campaign donors and long-time patrons of the VP-elect, billionaires Peter Thiel and Marc Andreessen), so this decision is far from surprising.

It’s too early to say, but if this Administration takes a more open approach to AI, which, judging by the VP-elect’s words, seems likely, that’s already a good thing.

TREND OF THE WEEK
Amazon’s Great Week

Finally, this week, Amazon officially became a strong player in the frontier AI scene, all the while making the lives of those trying to make some money from this business even harder.

And while it might not seem like it, one company that will be nervous about this release is… NVIDIA.

Let’s review every detail.

Amazon Nova

Amazon Nova is a suite of state-of-the-art foundation models developed by Amazon to compete with the likes of Claude 3.5 Sonnet, GPT-4o, Gemini 1.5 Pro, Grok-2, o1-preview, and Qwen QwQ, those considered the best Large Language Models (LLMs) and Large Reasoner Models (LRMs) in the game.

A New Great Family

The Amazon Nova family consists of six main models (all descriptions according to Amazon):

Nova Premier is unreleased.

  1. Amazon Nova Premier: The family’s most powerful model, fully state-of-the-art; Amazon directly suggests using it mainly for distillation (generating high-quality data to train more affordable models) or very complex tasks, so it’s presumably huge.

  2. Amazon Nova Pro: A high-performance multimodal model optimized for a balance of accuracy, speed, and cost. It excels in video understanding, visual question answering, and function calling. Supports long context inputs and agentic workflows for complex task automation.

  3. Amazon Nova Lite: A lightweight multimodal model with faster processing at a lower cost. It is better suited for tasks requiring high-speed analysis of text, images, and video, while maintaining competitive performance on benchmarks.

  4. Amazon Nova Micro: A text-only model for low latency and cost, tailored for tasks requiring fast and efficient language understanding and reasoning.

  5. Amazon Nova Canvas: A diffusion-based image generation model for creating high-quality images. It offers features like text-to-image generation, inpainting, outpainting, and background editing, with extensive customization options.

  6. Amazon Nova Reel: A video generation model capable of producing 6-second high-quality videos. It supports text-to-video generation, video creation from reference images, and advanced camera motion controls.

A very interesting point of this approach is that while most AI labs release a family of models that are the same model at different sizes and performance levels (like Meta’s Llama releases), Amazon is releasing models with a clear intention of solving different problems. In other words, each model is optimized for specific use cases.

But, to me, the stars of the show will be Nova Micro and Nova Lite. Here’s why.

The Big-to-Small Distillation Cycle

I have very strong opinions about Generative AI deployments. While most companies look no further than GPT-4o or Claude 3.5 Sonnet at best, the reality in 2025 will be much different, and it’s what Amazon is hinting at with this release: the distillation-to-production cycle.

The best thing about this release is that Amazon gets it. Despite being an extremely closed release (model weights aren’t released, and neither is the training data), Amazon’s go-to-market strategy is actually on point; they are clearly focusing on allowing people to distill performance from larger models to smaller ones through fine-tuning, the unequivocally best cost/quality approach to generative AI production deployments.

In other words, they’ve prepared this release with the clear goal of driving people through the following pipeline (sketched in code right after the list):

  1. Try the use case in a developer environment with Nova Premier (once released) or Nova Pro for validation.

  2. Generate synthetic data with these models.

  3. Fine-tune Nova Lite or Nova Micro (distillation is native to the platform, so it should be a breeze).

  4. Deploy alongside Amazon Bedrock’s native integrations.

  5. Gather production data and go back to step 2. Repeat.
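Here is a minimal, hypothetical sketch of that cycle in Python. To be clear, none of the classes or functions below are Amazon’s actual Bedrock API; they are placeholders illustrating the data flow, with Nova Pro standing in as the teacher and Nova Micro as the student:

```python
# Hypothetical sketch of the distillation-to-production cycle; nothing here
# is Amazon's real Bedrock API, just placeholders showing the data flow.

class Model:
    """Stand-in for a hosted model endpoint (teacher: Nova Pro, student: Nova Micro)."""
    def __init__(self, name: str):
        self.name = name

    def generate(self, prompt: str) -> str:
        return f"[{self.name} completion for: {prompt}]"  # stubbed output

def finetune(student: Model, pairs: list) -> Model:
    # Placeholder: a real implementation would launch a fine-tuning job on
    # the (prompt, teacher_completion) pairs and return the updated model.
    print(f"Fine-tuning {student.name} on {len(pairs)} synthetic examples")
    return student

def distillation_cycle(teacher: Model, student: Model, prompts: list, rounds: int = 2) -> Model:
    for _ in range(rounds):
        # Step 2: generate synthetic data with the large model.
        pairs = [(p, teacher.generate(p)) for p in prompts]
        # Step 3: distill it into the small model via fine-tuning.
        student = finetune(student, pairs)
        # Steps 4-5: deploy the student, gather fresh production prompts,
        # and loop back to step 2 (stubbed here by reusing the same prompts).
    return student

distillation_cycle(Model("nova-pro"), Model("nova-micro"),
                   ["Summarize this support ticket: ..."])
```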

Anthropic has just announced it is offering distillation from Claude 3.5 Sonnet into Claude Haiku through the AWS platform, promising Claude 3.5 Sonnet-level performance at Claude Haiku cost, proving that Amazon just gets it.

This is how most Generative AI deployments should be: you get the performance of a large model for the cost of a small one. At this point, you may think there’s too much risk in using the small models; after all, distillation doesn’t make a small model as good as a large one, right?

But here’s the thing: Just like with OpenAI’s GPT-4o-mini, Gemini Flash, or Claude 3.5 Haiku, small language models are already scarily good and unfathomably more ‘Pareto optimized’ (the quality per unit of cost is much, much superior) than their larger counterparts.

Just check the benchmark table from Amazon’s announcement.

Amazon’s Nova family isn’t only on par with the state of the art; if you look at the small models, they offer around 85-90% of the performance of their larger peers while being much cheaper and faster to run (sometimes by an order of magnitude). I’m sorry, but if I see a client deploying large models in production, I’m immediately skeptical. Barring some particular cases, you are doing something wrong, period.
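To make the ‘Pareto’ point concrete, the arithmetic is trivial; the numbers below are illustrative, not Amazon’s actual prices or benchmark scores:

```python
# Illustrative quality-per-dollar comparison; the numbers are made up,
# NOT Amazon's published benchmark scores or prices.
large = {"quality": 1.00, "usd_per_1m_tokens": 10.00}  # frontier-class model
small = {"quality": 0.87, "usd_per_1m_tokens": 1.00}   # ~87% quality, ~10x cheaper

pareto_large = large["quality"] / large["usd_per_1m_tokens"]
pareto_small = small["quality"] / small["usd_per_1m_tokens"]

print(f"Small model: {pareto_small / pareto_large:.1f}x more quality per dollar")
# -> 8.7x
```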

Amazon’s model card includes many other tailored evaluations of the Nova models in categories like agentic benchmarks, software engineering, and finance, all offering very compelling results you might want to check out.

But Amazon has much more to celebrate; this is a hardware victory, too.

The Strategic Importance of Trainium

Trust me, no one is madder at NVIDIA’s 55% net profit margin than its customers, which, as you may guess, include Amazon. Everyone wants to free themselves, and competition is fierce.

Just like Google, Amazon has been making strides in advancing its chip design capabilities (although Google is deeper into this journey) through the Trainium and Inferentia chips, which Anthropic is already extensively using.

The Nova models were trained using Amazon’s Trainium1 chips and, although NVIDIA’s A100 and H100 accelerators were also used, it’s nevertheless a massive victory for Amazon, which can progressively reduce its dependence on NVIDIA’s products; these are not only expensive but extremely supply-constrained (which drives the price even higher).

Besides Trainium chips being less energy-demanding (although less powerful), to make matters worse for NVIDIA, its biggest stronghold is AI training... which is steadily losing its share of total compute in favor of inference, mainly thanks to LRMs, which can ‘think for longer’ on every user query in real time, increasing compute requirements.
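As a rough illustration of why ‘thinking for longer’ shifts compute toward inference, here’s a sketch using the common ~2 × parameters FLOPs-per-token approximation; the model size and token counts are hypothetical:

```python
# Rough illustration, not a measured benchmark: forward-pass compute per
# generated token is approximated as ~2 * parameters FLOPs.
params = 70e9            # hypothetical 70B-parameter model
answer_tokens = 300      # a typical direct LLM answer
thinking_tokens = 3_000  # hypothetical hidden chain of thought for an LRM

flops_llm = 2 * params * answer_tokens
flops_lrm = 2 * params * (thinking_tokens + answer_tokens)

print(f"LLM query: {flops_llm:.2e} FLOPs")
print(f"LRM query: {flops_lrm:.2e} FLOPs "
      f"({flops_lrm / flops_llm:.0f}x the inference compute)")
# -> 11x the inference compute for the same question
```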

However, for inference (running the models), there are already much more efficient alternatives to NVIDIA’s offerings in custom-made hardware like Groq’s LPUs, SambaNova’s RDUs, and, sooner rather than later, Etched’s Sohu.

If the balance of compute moves toward inference as LRMs become more widely used (which seems overwhelmingly likely), NVIDIA could be in serious trouble.

On the flip side, NVIDIA knows this, which is why the largest improvement in its new platform, Blackwell, comes at the inference level (9x better inference than the H100), achieved by stacking many more GPUs on a single rack connected by the same high-bandwidth equipment (NVLink and NVSwitch). Notably, all 2025 Blackwell units have already been sold.

So don’t be so quick to count them out.

The issue with Blackwell is that it basically requires ad hoc data centers built for it, reducing the number of potential customers to just a handful. And all of these are doing what Amazon has just done: aiming to reduce their dependence on NVIDIA.

TheWhiteBox’s takeaway:

There’s no way around it; this is a massive win for Amazon. Often seen as a laggard in terms of frontier development, the only reason it wasn’t viewed as badly—in AI terms—as Apple (the other Big Tech laggard) was its investment in Anthropic (which it just doubled down on).

But in one single blow, they are in the game as much as any other.

  • They have the desired compute (the largest cloud in the world, although Google has way more AI accelerators, GPUs et al.),

  • a top-level frontier AI research lab they basically own in Anthropic,

  • a good foundation model that is proprietary to them (reducing dependency on Anthropic),

  • and now their hardware efforts are finally paying off.

As one caveat, they are very closed, probably the most closed AI lab of them all. But that aside, may I insist that Amazon gets it? They know their platform needs to be tailored for fine-tuning LLMs in the distillation-to-production cycle, and that in itself could be a massive catalyst for the adoption of Amazon’s AI services in 2025 once companies realize that fine-tuning is the way to go.

Deploying a prompt-engineered GPT-4o in production is cute, but not what your boss will expect from you in a few months. And Amazon gets it.

THEWHITEBOX
Premium

If you like this content, by joining Premium, you will receive four times as much content weekly without saturating your inbox. You will even be able to ask the questions you need answers to.

Until next time!