Sci-Tech

Crossed Wires: DeepSeek, hand-wringing, hysteria and the changing geopolitics of technology

One of the biggest tech announcements in decades came hurtling like a comet out of the dark last week and crashed to Earth with a big boom and a lot of geopolitical Sturm und Drang. Journalists, bloggers and posters everywhere thanked the gods of the Fourth Estate and dashed to their laptops to try to understand what had just happened, to analyse and opine.

So did I, but other deadlines intruded, preventing a hasty filing. Good thing too, because it turns out the myriad effects of this announcement continue to ripple and morph from Silicon Valley to the tech hubs of Shanghai, from the central committee of the Chinese Communist Party to the halls of power in Washington.

I am talking, of course, about the announcement of an AI called DeepSeek-R1, made and distributed by the Chinese company DeepSeek.

Here is how I imagine the news might have broken:

OpenAI CEO Sam Altman’s phone rings at 2am on 20 January.

Sam (groggily): Hello?

OpenAI minion: Hi Sam. Sorry to call so late. We might have a problem.

Sam: It’s 2am for god’s sake. I’ve been dealing with damn half-trillion-dollar data centre plans all day. Can’t this wait?

OpenAI minion: Um, it’s kind of a big problem.

Sam (sighing): Go on.

OpenAI minion: Remember that Chinese company that was trying to build an AI?

Sam: Who? The TikTok people? Alibaba?

OpenAI minion: DeepSeek.

Sam: Say again?

OpenAI minion: DeepSeek.

Sam: Rings a bell, I think. Something to do with hedge fund trading or something?

OpenAI minion: They released a version today. A general chatbot.

Sam: Not another one. So what?

OpenAI minion: It’s as good as ours.

Sam: Yeah, sure. Government funded?

OpenAI minion: Private.

Sam: Where did they get the money?

OpenAI minion: Um, here’s the thing. It cost them under $6-million.

Sam: What cost them six million? The web front end?

OpenAI minion: The training.

Sam: Say that again, please?

OpenAI minion: It cost them six million to train the model.

(There is silence.)

OpenAI minion: Sam, are you there?

OpenAI minion: Sam, have you lost the connection? I can’t hear you.

Sam: Six million? SIX MILLION DOLLARS? ARE YOU FUCKING KIDDING ME?

OpenAI minion: Oh, and the code is open source, free.

(Sounds of hyperventilating from Sam.)

Sam: I thought Nvidia H100 chips were banned in China? Did they steal the chips or something?

OpenAI minion: They used the lesser chips.

Sam: THEY USED WHAT?

OpenAI minion: The throttled export ones. The H800s.

(There is silence for a beat.)

Sam: Excuse me, I have to go to the bathroom.

(Sounds of vomiting can be heard and then the toilet flushing.)

Sam (back to phone): I am very tired. Why is my arm numb?

OpenAI minion: What should we do?

Sam: Get me Satya on the line.

(Sound of dialling.)

Microsoft CEO Satya Nadella (groggily): Hi Sam. It’s late. I’m asleep. Can we talk tomorrow?

Sam: Not really.

The facts 


Before we get to the repercussions of all of this, here are the top-line items further explained.

DeepSeek released a chatbot called DeepSeek-R1 on 20 January (a reasoning model built on top of its DeepSeek-V3, released in late 2024). The company claimed that training cost under $6-million. Given that AI elephants like OpenAI and Anthropic have spent billions on training, it was an extraordinary announcement. The chatbot was quickly tested by independent experts and it became clear that R1 is pretty much on a par with the best AI models. Unsurprisingly, a bunch of tech billionaire CEOs had simultaneous heart attacks.

The $6-million quoted was a little disingenuous: it was the price of the final training run only. The cost of earlier preparatory work was not disclosed. There were also suspicions that DeepSeek had illegally “distilled” training IP from OpenAI, although many suspect that everyone training large language models (LLMs) quietly does the same. Even so, the cost appears astonishingly low.
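For readers who want to see where a number like this comes from, here is a back-of-envelope sketch. It assumes the figures I understand to be in DeepSeek's V3 technical report (roughly 2.788 million H800 GPU-hours, priced at an assumed rental rate of $2 per GPU-hour). The function name is my own, and the result covers only the final run, not research, failed experiments, salaries or hardware purchases.

```python
# Back-of-envelope estimate of a final training run's cost.
# Assumption: GPU-hours and hourly rate are taken from my reading of
# DeepSeek's V3 technical report; treat both as illustrative inputs.

def training_run_cost(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Rental-style cost of one training run: hours on the chips times price per hour."""
    return gpu_hours * usd_per_gpu_hour

REPORTED_GPU_HOURS = 2_788_000   # assumed total H800 GPU-hours for the final run
ASSUMED_RENTAL_RATE = 2.00       # assumed US dollars per GPU-hour

cost = training_run_cost(REPORTED_GPU_HOURS, ASSUMED_RENTAL_RATE)
print(f"Estimated final-run cost: ${cost:,.0f}")
```

The point of the exercise is that the headline figure is a rental-style estimate for one run, which is why it sits so far below the all-in budgets quoted for the big US labs.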

It got worse for the AI incumbents. The mega-chips that power the big companies’ AI ecosystems are made by Nvidia, a company that has risen in a matter of decades from a quirky videogame hardware accelerator manufacturer to one of the most valuable tech companies in the world. In a single trading day last week, nearly $600-billion was wiped off Nvidia’s market cap. That’s $600-billion, not million: the largest single-day loss of value by any company in stock market history.

It got worse again. The export to China of Nvidia’s top-of-the-line chips (the nerdishly named H100s) has been banned since late 2022 because, well, the US doesn’t trust the Chinese with advanced US tech. At the time, it was happy for the Chinese to buy the H800, a version of the same chip with deliberately reduced chip-to-chip bandwidth (the H800 was itself restricted in late 2023). They could play nice with those.

It seems they played very nice indeed.

Oh, and one last thing. This product was built by an offshoot quant trading company. AI was not even its main business. To add insult to injury, DeepSeek-R1 is both open source and free, including the critical “weights” which are the core intellectual property of any LLM. This is not the case with OpenAI’s offerings.

We will get to the “how” in a minute, but let’s first look at the short-sightedness of the chip ban. It forced the Chinese to innovate. The US did not for a second believe that the Chinese could innovate their way around the chip ban, at least, not quickly. How could the Chinese possibly outsmart the best technical minds in the US — the chip designers, the best-of-the-best software developers from Stanford, the stratospherically salaried AI tech theorists from MIT?

The US has made this mistake before. Since the 1970s, it has restricted the export of encryption algorithms, precision machine tools, chip-making technologies, satellite technology and other technical wizardry. In every case, this simply spurred foreign companies to build their own. In some cases, it led to the US losing market share within a few years.

It could be argued that these bans had a more dangerous secondary effect: they encouraged complacency among the US companies that were protected from Chinese competition by the export bans. There are better ways to compete than by assuming others are not smart enough to find ways to cross your government-constructed moat. Equally, had the chips not been banned, the Chinese might never have been inspired to innovate as they have done.

And now the rules of the global AI arms race have completely changed. We’ll get back to that in a moment but, first, the tech.

The tech 


As to how they achieved R1, there was no single trick. DeepSeek innovated on multiple fronts, including chip optimisation and the development of new approaches to machine learning and inference. For instance, the programming platform provided with Nvidia chips is called CUDA. Platforms like this generally trade away some fine-grained control in favour of ease of use, particularly at the gnarly deep levels of the chip.

DeepSeek did not rely on CUDA alone for parts of its optimisation. It dropped down to PTX, Nvidia’s lower-level, assembly-like instruction set. Harder to wrangle, but well suited to chip optimisation, especially for things like managing communication among multiple cooperating chips.

A second set of tech inventions was concerned with splitting up the machine learning software “brain” into many “mini brains”, each with specific areas of expertise (an approach known as “mixture of experts”). Running one mega brain on every request requires a great deal of computation. Activating only the experts needed for a given task is much more efficient.
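To make the “mini brains” idea concrete, here is a toy sketch of how a mixture-of-experts layer might pick which experts to run. It is not DeepSeek's code: the function names, the eight pretend experts and the gate scores are all invented for illustration, and a real model would use learned neural networks rather than simple multipliers.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and renormalise their weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the chosen experts and blend their outputs by weight."""
    chosen = route_top_k(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in chosen)
    return output, [i for i, _ in chosen]

# Eight hypothetical experts; each one just multiplies its input by a constant.
EXPERTS = [lambda x, scale=scale: scale * x for scale in range(1, 9)]

# Hypothetical gate scores for one token; a real router would compute these.
SCORES = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

y, used = moe_forward(3.0, EXPERTS, SCORES, k=2)
print(f"ran {len(used)} of {len(EXPERTS)} experts: {used}, output {y:.3f}")
```

Only two of the eight experts do any work for this input, which is the source of the saving: the model can be very large in total while each request pays for only a small slice of it.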

A third clever idea concerned the “next word”. The big models work by predicting the next word (or “token”) while building their response to a prompt. DeepSeek trained its model to predict the token after that as well, so that it learns to look a step further ahead. It apparently made a significant difference to the performance of the model.
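As far as I understand the published description, the technique is usually called multi-token prediction: during training the model is scored not only on the next token but also on the one after it. The sketch below shows that scoring idea on a three-word sentence. The probabilities, the `extra_weight` value and the function names are made up for illustration and are not taken from DeepSeek's code.

```python
import math

def cross_entropy(probs, target):
    """Penalty for a prediction: small when the true token was given high probability."""
    return -math.log(probs[target])

def multi_token_loss(next_probs, second_probs, tokens, position, extra_weight=0.3):
    """Loss at one position: the usual next-token penalty plus a
    down-weighted penalty for the token after that."""
    main = cross_entropy(next_probs, tokens[position + 1])
    extra = cross_entropy(second_probs, tokens[position + 2])
    return main + extra_weight * extra

# Toy sentence and hypothetical model guesses made while looking at "the".
TOKENS = ["the", "cat", "sat"]
NEXT_GUESS = {"cat": 0.5, "dog": 0.3, "sat": 0.2}      # guess for the next token
SECOND_GUESS = {"sat": 0.25, "ran": 0.5, "cat": 0.25}  # guess for the token after

loss = multi_token_loss(NEXT_GUESS, SECOND_GUESS, TOKENS, position=0)
print(f"combined training loss at position 0: {loss:.4f}")
```

The extra term gives the model a second learning signal from every position in the training text, which is one plausible reason the approach improves results without more data.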

There was more, of course, but at a technical level of detail that requires deep and eye-glazing expertise. The important point here is that the DeepSeek team applied innovation broadly across the entire technical ecosystem, looking into every nook and cranny to tweak and improve performance and cost efficiency.

The repercussions 


In any event, many questions now arise. Is it the end of US hegemony in AI? Yes, definitely, and it is a measure of the conceit of US exceptionalism that those who should have known better thought otherwise.

Will it supercharge the confidence of Chinese and other non-US players to nurture their best and brightest to work on the technologies of the future? Again, yes. A Chinese researcher opined that it is exactly what was needed for many of his colleagues to shed the false sense of inferiority that has bedevilled them.

Will the US lose its lead and surrender ground to China? No, definitely not. It will replicate the Chinese innovations by the end of the day. AI war rooms are now ablaze with urgency all over Silicon Valley to prevent this from happening again. With the US AI industry on a war footing, innovation and invention will be seriously juiced. Capitalist incentives have given the US a good track record in this area.

Will the eye-watering $500-billion for a Texas-based AI infrastructure play recently announced by US President Donald Trump quickly become an embarrassing boondoggle in the face of DeepSeek’s thrifty new offering? Maybe not quite a boondoggle, but I expect that the investors in this initiative will be asking some hard questions. The numbers may well change but compute is king, so it will not go away.

Is this tantamount to a revolution in AI? No, not really. It was an incremental improvement, even if the increment was wildly impressive. It was not, as they say, a paradigm shift. And for those who claim the whole thing is an elaborate Chinese fake, nope — the researchers had published a slew of impressive publicly available papers in the run-up to the launch.

Will users flee ChatGPT and the others for a new home at DeepSeek? Well, consider that a query to DeepSeek about Tiananmen Square returns a stony silence. The Chinese government is not enamoured of free expression and freedom of information, so don’t expect DeepSeek to be without its information lacunae and flaws. So, no, I don’t think we will all be turning away from the current crop of Western offerings. Besides, most people are a little distrustful of China’s commitment to user privacy. (As this article was being written, Italy banned the use of DeepSeek for this very reason.)

Thirty years ago, China was good at the grubby business of reliable mass manufacturing. It is now on a par with, or ahead of, the rest of the world in EVs, batteries, lidar, drones, robotics, smartphones, space exploration, aerospace, semiconductors, computer vision and automobiles. All are top-drawer and less expensive than Western competitors. And now, with the DeepSeek announcement, add artificial intelligence, potentially one of the most transformative new technologies ever developed. (There are at least four other large Chinese generative AI offerings, but DeepSeek broke new ground on more than one front, including cost.)

China’s tech strides over this period are, from many perspectives, shocking to the West. After all, its political culture, freighted with conformity and repression, was not expected to produce such advances. But it has, and the US can no longer expect to stay in the lead in many areas where it was once believed to be unassailable.

A recent Guardian article went further, suggesting that we are witnessing the end of the US empire and that Trump is a symptom, no more than a tragic, impotent and reactive shriek of outrage as he promises to bring the US back to its previously vaunted position of excellence, the envy of the world. He won’t do it, the article argues. No one will. It is over.

I’d prefer not to believe that, but DeepSeek is a dangerous harbinger. The one who wins AI wins everything, and nothing is certain about that race any more. DM

Steven Boykey Sidley is a professor of practice at JBS, University of Johannesburg, columnist-at-large for Daily Maverick and a partner at Bridge Capital. His new book, It’s Mine: How the Crypto Industry is Redefining Ownership, is published by Maverick451 in SA and Legend Times Group in the UK/EU and is available now.
