AI Markets Were Deceived Into Believing In DeepSeek’s Low Training Costs; They Are Reportedly 400 Times Higher Than The Reported Figure

The market was shaken by the controversy surrounding the training costs of DeepSeek’s R1 model, but the reported numbers appear to have been misleading, and the real figures are surprising.

DeepSeek’s Training Costs Are Said To Be Drastically Higher Than The Documented “$5 Million” Figure, And The Firm Has Access To High-End Hardware

The research firm SemiAnalysis has thoroughly examined DeepSeek’s actual training costs, refuting the narrative that R1 is so efficient that computing power from NVIDIA and other companies is no longer needed. Before we dive into the actual hardware used by DeepSeek, let’s take a look at what the industry initially perceived. It was claimed that DeepSeek spent just “$5 million” on its R1 model, which is on par with OpenAI’s o1, and this triggered panic in the financial markets. Now that the dust has settled, let’s take a look at the actual figures.

For those unaware, DeepSeek is said to have started as a side project of the Chinese hedge fund High-Flyer, and the SemiAnalysis report says that the fund purchased 10,000 units of NVIDIA’s A100 back in 2021, when export restrictions weren’t that strict. DeepSeek then evolved into a separate entity after its parent company, High-Flyer, decided to spin it off, and that’s when things really took off. With that, the firm started accumulating computing resources, which we’ll discuss next.

Image Credits: SemiAnalysis

The report says that DeepSeek has around 10,000 of NVIDIA’s “China-specific” H800 AI GPUs and 10,000 of the higher-end H100 AI chips. Moreover, the firm has invested in NVIDIA’s H20 AI accelerators, and it has a “pool” of resources that is shared between DeepSeek and High-Flyer for “trading, inference, training, and research”. This translates into approximately $1.6 billion in CapEx for DeepSeek, with operating costs rumored to be around $944 million. The figures are roughly four hundred times higher than what the markets were initially expecting.
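As a rough sanity check on that multiple, the quoted amounts can be divided by the headline number. The sketch below uses only the dollar figures cited above; the variable and function names are illustrative, and the result depends on which costs are counted, so “four hundred times” is best read as an order-of-magnitude estimate rather than an exact ratio.

```python
# Figures quoted in the article, in US dollars. All are reported estimates.
REPORTED_TRAINING_COST = 5_000_000   # the widely cited "$5 million"
ESTIMATED_CAPEX = 1_600_000_000      # CapEx estimate attributed to SemiAnalysis
ESTIMATED_OPEX = 944_000_000         # rumored operating costs


def multiple_of_reported(amount, baseline=REPORTED_TRAINING_COST):
    """Return how many times larger `amount` is than the reported figure."""
    return amount / baseline


capex_multiple = multiple_of_reported(ESTIMATED_CAPEX)
opex_multiple = multiple_of_reported(ESTIMATED_OPEX)
combined_multiple = multiple_of_reported(ESTIMATED_CAPEX + ESTIMATED_OPEX)

print(f"CapEx alone:      {capex_multiple:.0f}x")
print(f"Operating costs:  {opex_multiple:.0f}x")
print(f"CapEx + OpEx:     {combined_multiple:.0f}x")
```

On these inputs, CapEx alone comes to a few hundred times the headline figure, and adding operating costs pushes the multiple higher, which brackets the “roughly four hundred times” quoted above.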


To clarify, the initial figure is said to cover only a “specific part” of the training costs, likely those associated with running the final model. DeepSeek also excelled at tapping local talent, recruiting at top local universities and reportedly paying certain employees more than $1.3 million. The “misreported” financial figures acted as a catalyst for last week’s black swan event, but the brains behind DeepSeek’s R1 model were indeed capable of coming up with an effective solution to compete with the likes of OpenAI.

SemiAnalysis’s extensive analysis of DeepSeek is well worth checking out, because the details are intriguing.
