The DeepSeek Revolution: Architecture, Economy, and the New AI Order
Lecture 1

The New Challenger: Who Is DeepSeek?

Transcript

A single AI model, trained for roughly $5.58 million, just erased nearly $600 billion of Nvidia's market value in a single trading day. That is not a typo. The training figure comes from DeepSeek's own technical report, and analysts tracking AI compute costs, including those cited by the Financial Times, confirmed it; the number hit the industry like a thunderclap. The lab behind it was not OpenAI, and it was not Google DeepMind. It was DeepSeek, a Chinese AI research group that most of Silicon Valley had barely heard of.

DeepSeek did not emerge from nowhere, Yunying. Its parent company is High-Flyer Quant, a prominent Chinese hedge fund with deep expertise in large-scale computation. High-Flyer had already built a proprietary supercomputing cluster called Fire-Flyer, originally designed for quantitative trading. That infrastructure became the launchpad for a serious pivot into frontier AI research. This is a critical detail: DeepSeek did not need to buy its way into the compute race. It already had the hardware, the capital discipline, and the engineering culture to move fast.

The models that shocked the world were DeepSeek-V3, released in late December 2024, and DeepSeek-R1, which followed in January 2025. For context, training runs for models like GPT-4 are estimated to have cost hundreds of millions of dollars. DeepSeek-V3's final training run reportedly cost $5.58 million, a figure that, by the report's own accounting, covers the official run only and excludes prior research and ablation experiments. Even with that caveat: same competitive performance tier, drastically lower cost. That gap is not a rounding error. It is a structural statement about how DeepSeek approached the problem differently from the ground up. The arithmetic behind the headline number is sketched at the end of this transcript.

The architectural secret is something called Multi-head Latent Attention, or MLA. Standard Transformer models, the backbone of most large language models, compute attention across many parallel heads, and during inference every head must cache its keys and values for each token seen so far. That key-value cache grows with context length and quickly becomes the dominant consumer of GPU memory. MLA compresses the cache into a small shared latent vector per token and reconstructs keys and values from it on demand, slashing memory overhead and enabling far more efficient processing without sacrificing output quality. This is not a minor tweak. It is a fundamental rethink of where compute gets spent, and it is why DeepSeek could do more with fewer high-end GPUs. A toy implementation also appears after the transcript.

The industry response was immediate and visceral. When DeepSeek's models dropped, Nvidia shares fell sharply, producing a historic single-day loss of nearly $600 billion in market capitalization, at the time the largest one-day value destruction for any company, as reported by Reuters. Commentators across the technology sector began using the phrase "Sputnik moment", a reference to the 1957 satellite launch that forced the United States to reckon with a competitor it had underestimated. The analogy holds, Yunying, because the shock was not just technical. It was psychological. The assumption that raw GPU scale was the only path to frontier AI had just been publicly challenged.

Here is what this all adds up to, and why it matters for you specifically: DeepSeek proved that architectural innovation and cost discipline can outmaneuver brute-force hardware spending. The competitive landscape for AI is no longer defined solely by who has the most chips. It is defined by who builds the most efficient systems. DeepSeek moved from underdog to global reference point not by outspending its rivals, but by outsmarting the assumptions they were all building on. That is the revolution this course is about.
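A note on where the $5.58 million comes from. DeepSeek's V3 technical report arrives at it by simple arithmetic: total H800 GPU-hours multiplied by an assumed rental price of $2 per GPU-hour. The sketch below reproduces that back-of-envelope using the GPU-hour breakdown as the report states it; treat the inputs as DeepSeek's own stated assumptions, not independently audited costs.

```python
# Back-of-envelope behind the headline training cost, using the H800
# GPU-hour breakdown stated in DeepSeek's V3 technical report. The
# $2/GPU-hour rental price is the report's own assumption, and the total
# deliberately excludes prior research, ablations, and data costs.
gpu_hours = {
    "pre-training": 2_664_000,
    "context extension": 119_000,
    "post-training": 5_000,
}
price_per_gpu_hour = 2.00  # USD per H800 GPU-hour (assumed rental rate)

total_hours = sum(gpu_hours.values())          # 2,788,000 GPU-hours
total_cost = total_hours * price_per_gpu_hour  # 5,576,000 USD

print(f"{total_hours:,} GPU-hours x ${price_per_gpu_hour:.2f} = ${total_cost:,.0f}")
# -> 2,788,000 GPU-hours x $2.00 = $5,576,000  (the ~$5.58M headline figure)
```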
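And to make the MLA idea concrete, here is a deliberately simplified PyTorch sketch of the latent key-value trick. Everything in it is illustrative: the class name SimplifiedMLA, the dimensions, and the parameter names are inventions for this lecture, and the production design described in DeepSeek's papers adds low-rank query compression and a decoupled rotary-embedding path that this toy omits. The point it demonstrates is the one that matters: the cache stores one small latent vector per token, and full keys and values are reconstructed from it on demand.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimplifiedMLA(nn.Module):
    """Toy Multi-head Latent Attention (hypothetical, illustrative sizes).

    A standard multi-head cache stores full keys and values per token
    (2 * d_model floats here). This sketch caches only a d_latent vector
    per token and up-projects it to keys/values when attention runs.
    RoPE and DeepSeek's decoupled positional path are omitted.
    """

    def __init__(self, d_model=512, n_heads=8, d_latent=64):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.kv_down = nn.Linear(d_model, d_latent)  # compress: the only thing cached
        self.k_up = nn.Linear(d_latent, d_model)     # reconstruct keys from the latent
        self.v_up = nn.Linear(d_latent, d_model)     # reconstruct values from the latent
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x, latent_cache=None):
        b, t, d = x.shape
        new_latent = self.kv_down(x)                 # (b, t, d_latent)
        latent = new_latent if latent_cache is None else torch.cat(
            [latent_cache, new_latent], dim=1)

        def split(z):  # (b, T, d_model) -> (b, n_heads, T, d_head)
            return z.view(b, -1, self.n_heads, self.d_head).transpose(1, 2)

        q = split(self.q_proj(x))
        k, v = split(self.k_up(latent)), split(self.v_up(latent))
        # Causal mask is only needed at prefill; a single-token decode step
        # may legitimately attend to every cached position.
        out = F.scaled_dot_product_attention(q, k, v, is_causal=latent_cache is None)
        out = out.transpose(1, 2).reshape(b, t, d)
        return self.out_proj(out), latent            # latent is the new, compact cache

mla = SimplifiedMLA()
y, cache = mla(torch.randn(1, 10, 512))        # prefill: cache has shape (1, 10, 64)
y, cache = mla(torch.randn(1, 1, 512), cache)  # decode: cache grows to (1, 11, 64)
# Per token we cache 64 floats instead of 1024 (512 keys + 512 values): 16x smaller.
```

At these toy sizes the cache shrinks sixteenfold; for comparison, DeepSeek's own papers report a key-value cache reduction on the order of 93 percent against standard multi-head attention, which is the same order of effect.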