Born out of High-Flyer, the hedge fund of pro-Chinese government billionaire Liang Wenfeng, DeepSeek is a wake-up call for investors who have poured billions of dollars into the United States tech sector.
Using second-tier GPUs from NVIDIA because of U.S.-imposed export restrictions, China-backed DeepSeek's large language model (LLM) teaches itself entirely through trial and error, or reinforcement learning (RL), and without supervised fine-tuning (SFT).
And impressively, DeepSeek’s open-source R1 model matches OpenAI’s o1 model for an estimated 3-5% of the cost.
DeepSeek claims its latest model cost just US$5.6 million (A$8.95 million) to train. In contrast, most other generative AI models’ training costs come in at somewhere between US$100 million and US$1 billion, with most of that spent on expensive cutting-edge GPUs from NVIDIA.
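To put those figures side by side, the short sketch below works out what share of a US$100 million to US$1 billion training bill the claimed US$5.6 million would represent. It uses only the numbers quoted above, which are DeepSeek's claim and industry estimates rather than audited costs; the variable and function names are illustrative.

```python
# Figures as quoted in the article (claims and estimates, not audited costs).
DEEPSEEK_CLAIMED_USD = 5.6e6   # DeepSeek's claimed training cost
TYPICAL_LOW_USD = 100e6        # low end of the quoted range for other models
TYPICAL_HIGH_USD = 1e9         # high end of the quoted range for other models


def share_of(cost: float, baseline: float) -> float:
    """Return cost expressed as a percentage of baseline."""
    return 100.0 * cost / baseline


low_share = share_of(DEEPSEEK_CLAIMED_USD, TYPICAL_LOW_USD)
high_share = share_of(DEEPSEEK_CLAIMED_USD, TYPICAL_HIGH_USD)

print(f"vs US$100M: {low_share:.2f}% of the cost")
print(f"vs US$1B:   {high_share:.2f}% of the cost")
```

On these numbers, the claimed figure sits at roughly 0.6% to 5.6% of what other models are reported to cost, depending on which end of the range is used as the baseline.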
“DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on reasoning tasks. To support the research community, we open-source DeepSeek-R1-Zero, DeepSeek-R1, and six dense models (1.5B, 7B, 8B, 14B, 32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama,” said its developers.
Setting itself apart from the pack, the company’s R1-Zero model is trained through pure reinforcement learning, without supervised fine-tuning as a preliminary step, learning and improving by trial and error through rewards and penalties.
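For readers unfamiliar with the idea, the toy sketch below shows what learning from rewards and penalties by trial and error can look like in its simplest form. It is a generic epsilon-greedy example, not DeepSeek's training procedure; the reward values, function name and parameters are all hypothetical.

```python
import random


def learn_by_trial_and_error(rewards, steps=2000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy learner: estimate each action's value from feedback alone."""
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)  # current value estimate per action
    counts = [0] * len(rewards)       # how often each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try an action at random.
            action = rng.randrange(len(rewards))
        else:
            # Exploit: otherwise pick the action that currently looks best.
            action = max(range(len(rewards)), key=estimates.__getitem__)
        reward = rewards[action]      # positive = reward, negative = penalty
        counts[action] += 1
        # Incremental average: nudge the estimate toward the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


# Hypothetical feedback for three candidate behaviours (-1.0 is a penalty).
HYPOTHETICAL_REWARDS = [-1.0, 0.2, 1.0]

estimates = learn_by_trial_and_error(HYPOTHETICAL_REWARDS)
best = max(range(len(estimates)), key=estimates.__getitem__)
print("preferred action:", best)
```

No one tells the learner which behaviour is correct; it simply tries options, records the feedback, and drifts toward whatever earns the most reward.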

Is it a game-changer?
Questions remain, however, about its use of Llama, its comparatively tiny ~$6 million investment (98% cheaper than OpenAI’s reported spend), and how it ramped up so quickly.
RBC Capital's Brad Erickson says that “while DeepSeek's progress may not alter the pace of AI investment near-term - even in spite of the headline <$6M training cost for its R1 model, it points to downside potential to recently announced targets like Stargate’s $500B of AI infrastructure over the next four years or META’s $60-65B of capex to be spent in 2025 alone”.
“Beyond just simple hardware pricing pressure, the question becomes how quickly can the U.S. tech titans learn and apply DeepSeek (and others) efficiencies and begin altering their infrastructure spending plans?”
However much time or money was invested, the fact remains that China and DeepSeek have arrived on an AI scene dominated by Silicon Valley - and their models are open source.
The timing of DeepSeek’s V3 and R1 model releases was no mere coincidence - one might even assume it was deliberate - with the models gaining traction just after newly re-inaugurated U.S. President Donald Trump announced a US$500 billion spending package for domestically driven AI innovation.
The reason? America is in a tit-for-tat war with China for global superiority, and the tech sector is where the battle is currently being fought.
“AI’s Sputnik moment”
Meanwhile, the rest of the AI tech universe is scrambling.
Reacting to the news, Marc Andreessen, one of the most influential tech venture capitalists in Silicon Valley, likened DeepSeek’s model to Sputnik, the Soviet satellite whose 1957 launch put the USSR first into space.
Mark Zuckerberg's Meta has reportedly set up several war rooms full of engineers to figure out how DeepSeek’s AI got to where it is at a fraction of competitors' capital outlay, including how Llama could be restructured based on attributes of DeepSeek’s models.
Other responses from major tech companies welcome DeepSeek’s low-cost alternative to the status quo.
Former Intel CEO Pat Gelsinger said the release would expand the AI market instead of diminishing it, crediting DeepSeek engineers, who “had limited resources, and they had to find creative solutions”.
“Open Wins. DeepSeek will help reset the increasingly closed world of foundational AI model work.”
Microsoft CEO Satya Nadella invoked the Jevons paradox on LinkedIn, arguing that innovation and efficiency will drive AI demand; and Meta chief AI research scientist Yann LeCun argued that open-source models are surpassing proprietary ones.
Yet AI unicorn Anthropic's boss Dario Amodei says DeepSeek's breakthrough is a sign the U.S. should maintain pressure on chip exports to China.
Amodei told CNBC he believes AI companies from the Middle Kingdom have more capability in the sector than they're letting on, as many procured large quantities of hardware before restrictions came into effect.
Market reactions
The global tech sector tanked on DeepSeek's AI revelations, with the NASDAQ dropping 3% and the S&P 500 1.5% by close on Monday.
Tech's big kahunas, including Alphabet (parent of Google), Microsoft, Amazon, Oracle, Micron and ASML, all took losses.
More broadly, a potentially super-cheap generative AI model is extremely enticing for mid-tier businesses and could accelerate both the advancement and the number of AI agents, which are expected to be 2025's hot innovation trend.
Janus Henderson Investors’ Oliver Blackbourn says that, given DeepSeek's disruptive, advanced nature and its timing, tech sectors could face a fairly gruesome week.
“The first credible threat to AI dominance among a small number of companies is [now] questioned,” Blackbourn said in a note.
“The emergence of a potentially more efficient approach to AI processing questions the need for the billions of dollars of expected investment in infrastructure and intellectual property.
“AI is seen as a highly intricate field of expertise, where industry leaders hold technological advantages that will sustain accelerated growth well into the future.”
Blackbourn says high expected earnings growth has been used to justify “very elevated valuations”, leaving those stocks very exposed to disappointment.
“Competition always looked like the biggest threat but also the hardest one for investors to assess and the market reaction to a perceived sea change in the competitive landscape is proving vicious.
“If we start to see valuations on U.S. stocks drop significantly, there is a danger that this propagates out to other high-valuation areas in Europe and Asia.
“Similarly, with U.S. consumers more exposed to stock markets than ever before, there is the danger of wider negative feedback loops if there's a loss of consumer confidence.
“A significant drop in financial conditions indicators due to stock market losses could change the Fed outlook quite quickly.”