30.01.2025
Coding good, politics bad
Stock market prices tumbled with the news that the People’s Republic is rivalling western tech barons. And it is not just hype, says Yassamine Mather
On January 27 global investors offloaded tech stocks amid a panic that DeepSeek and other such advances in Chinese artificial intelligence could challenge the dominance of industry leaders like Nvidia. This sell-off erased a staggering $593 billion from the chipmaker’s market value, marking the largest single-day loss for any company on Wall Street.
Although Nvidia, a key producer of graphics processing units (GPUs) essential for AI development, was the hardest hit, other major players such as OpenAI, Google, Anthropic and Microsoft (which owns a significant stake in OpenAI) also saw heavy losses, contributing to a total of $1 trillion wiped from the Nasdaq stock market index. The downward trend continued in Asian markets on January 28.
Nvidia was particularly affected, because it is one of the primary US companies producing GPUs - specialised hardware originally designed to accelerate image and video rendering for computer graphics. Unlike central processing units (CPUs), which are optimised for general-purpose computing, GPUs are built to handle highly parallelised operations, such as processing millions of pixels simultaneously. They consist of thousands of smaller, more efficient cores, capable of performing numerous calculations at once, making them ideal for tasks involving large datasets and complex mathematical computations. This architecture makes GPUs indispensable for machine learning and deep learning, which are critical components of AI. Training AI models requires the processing of vast amounts of data, often in matrix or tensor form, and GPUs excel at handling these calculations in parallel, speeding up the process compared to CPUs.
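To make the point concrete, here is a minimal sketch (in Python, using NumPy; the array sizes are purely illustrative) of the matrix arithmetic at the heart of model training. Every element of the output is an independent dot product - exactly the kind of work a GPU's thousands of cores perform in parallel, although NumPy here runs it on the CPU:

```python
# Toy illustration of the arithmetic GPUs parallelise. Training a
# neural network is, at its core, repeated matrix multiplication:
# each output element below is an independent dot product, so a GPU
# can compute thousands of them simultaneously. Frameworks such as
# PyTorch dispatch this same operation to GPU cores.
import numpy as np

rng = np.random.default_rng(0)
inputs = rng.standard_normal((64, 128))   # a batch of 64 samples, 128 features each
weights = rng.standard_normal((128, 10))  # one layer's weights: 128 inputs -> 10 outputs

activations = inputs @ weights  # 64 x 10 = 640 independent dot products

print(activations.shape)  # (64, 10)
```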
Modern deep learning frameworks like TensorFlow, PyTorch and JAX are optimised to use GPUs for best performance. So what are these highly valued tools?
TensorFlow: Imagine you have a big box of Lego bricks, and you want to build something like a robot or a car. But instead of following a specific instruction manual, you want the Lego bricks to learn how to build the best model on their own. TensorFlow is like a super-smart helper that organises the Lego bricks and tries different combinations to figure out the best way to build what you want. It does this by experimenting - it makes mistakes, and learns from those mistakes to improve over time. In the AI/GPU world, TensorFlow helps computers learn from data (like pictures, sounds or numbers) to solve problems, recognise patterns or make predictions - just like your Lego helper learns to build the best robot!
PyTorch: Imagine you want to teach a computer to recognise cats in photos. You would give it thousands of cat pictures, and PyTorch provides the tools to help the computer analyse those images, recognise patterns and improve over time - just as we get better at recognising faces, the more we see them. It is also used to teach computers to understand human language (as in chatbots) and to make predictions, such as forecasting the weather or stock prices.
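Both analogies describe the same trial-and-error loop. It can be sketched in a few lines of plain Python - a deliberately simplified toy, not real TensorFlow or PyTorch code: the 'model' is a single number, and learning means repeatedly nudging it to reduce the error on known examples:

```python
# A minimal sketch of the 'learning from mistakes' loop that
# frameworks like TensorFlow and PyTorch automate at scale.
# The training data follows the hidden rule y = 2x; the model
# must discover the 2 by trial and error.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, correct answer)

w = 0.0    # initial guess at the rule
lr = 0.05  # learning rate: how big each correction is

for _ in range(200):           # repeat the experiment many times
    for x, y in data:
        error = w * x - y      # how wrong the current guess is
        w -= lr * error * x    # adjust w in the direction that reduces the error

print(round(w, 3))  # converges to 2.0 - the rule has been 'learned'
```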
All of these tools rely on compiled libraries - in other words, pre-written code, such as CUDA (Nvidia’s parallel computing platform) and cuDNN (Nvidia’s deep neural network library) - to accelerate computations.
In AI, GPUs are essential, because they are highly scalable, allowing massive models like GPT and DALL·E to be trained across many chips in parallel. While they consume substantial power, they are often more energy-efficient than CPUs for AI workloads, because they complete tasks much faster.
Progress
Nvidia Corporation, headquartered in Santa Clara, California, is also a leading software company that designs and provides APIs (application programming interfaces) for high-performance computing. It dominates the AI hardware and software market, holding approximately 95% of the GPU market share for machine learning.
In 2022, the US government restricted Nvidia from selling its most advanced AI chips, such as the A100 and H100 GPUs, to China in an effort to limit its access to high-performance computing technology.
However, China had already begun developing its own GPUs prior to these restrictions. For instance, in response to a tender from a UK university in 2022, a Chinese company offered cost-effective, high-end GPUs, but lost the bid due to political tensions and concerns over a potential backlash.
Chinese GPU designers include Biren Technology, Moore Threads - which developed the MTT X300, a new graphics card for workstations - and Innosilicon. Despite this progress, Beijing initially struggled to respond effectively to the rise of ChatGPT, which remains unavailable in China - products from Tencent and ByteDance were dismissed as inferior imitations. Meanwhile, the US government, confident in its technological lead, tightened export bans to restrict China’s access to advanced chips and cutting-edge technology.
The export ban has arguably accelerated China’s efforts to develop its own GPUs and AI capabilities. However, claims by DeepSeek that it acquired a “substantial stockpile” of older Nvidia A100 chips (estimated at between 10,000 and 50,000 units) and trained its AI model using 2,000 A100 chips alongside thousands of lower-grade chips have been met with some scepticism. As someone who works daily with A100 and H100 GPUs, I find these claims difficult to believe. Additionally, DeepSeek’s assertion that it spent only $6 million on developing its AI tool has raised eyebrows. Tech analyst Gene Munster questioned the figures, suggesting the start-up may have received state subsidies. While it is unlikely that the Chinese government would provide unlimited funding for such projects, it is plausible that Chinese developers have focused on more efficient coding practices.
As noted in the blog, ‘Pensée Paul-Demarty’,
… the western tech industry has often relied on inflated promises of exponential growth - not in profits or revenue, but in easily manipulated metrics like user engagement. This has led to ballooning valuations and an influx of speculative investment.
I would add that in both the UK and the US a significant number of so-called AI ‘experts’ lack a deep understanding of the underlying mathematics, coding or the critical issue of code efficiency. AI has become a buzzword, co-opted by politicians and capitalists to project an image of innovation and progress, particularly during times of war, economic crises and uncertainty.
CNN has a reasonably good summary of the story of AI in the 2020s:
Sam Altman: Look, a toy that can write your book report.
VCs: This will fix everything!
Doomers: This will ruin everything.
Tech: We need money!
...
Tech (chants): More power! More power! More power!
And finally, in the year 2025, here comes DeepSeek to blow up the industry’s whole narrative about AI’s bottomless appetite for power, and potentially break the spell that had kept Wall Street funnelling money to anyone with the words, ‘harnessing artificial intelligence’, in their pitch deck.
High performance
For those who do not know, ‘open source’ refers to software that is free to use, with no licensing fees, and whose source code is publicly available, allowing users to inspect, modify and customise it to their needs. This openness tends to make such software more secure and better-performing. In high-performance computing, the open-source Linux operating system - capable of running for months or even years without needing a reboot - is the only operating system worth considering.
Now DeepSeek says it will publicly share key components of its AI models, including their source code, architecture and parameters, allowing developers, researchers and businesses to freely access, modify and build upon them. This approach is in contrast with closed-source models like OpenAI’s ChatGPT, which keep their technology accessible only through paid subscriptions or restricted APIs. DeepSeek’s models (R1 and V3) are available on platforms like GitHub, enabling users to inspect the details and inner workings of the AI, verify its decision-making processes and identify potential biases or errors. This means developers worldwide can modify the code to suit specific needs.
So why are they doing this? First of all, it has disrupted and will continue to disrupt market dynamics: by offering high-performance models at minimal cost, DeepSeek pressures companies like OpenAI and Google to justify their pricing and closed ecosystems. Comparing it with rivals, it is easy to see why the Chinese start-up has such confidence.
But why are DeepSeek and, by extension, the Chinese government so generous in allowing open-source access to DeepSeek’s code? It would be foolish to assume this is for the sake of human progress. The real reason is more practical: whoever accumulates more data and more code will win the AI race. China, with its billion-plus population, already has an advantage, and open access extends it by allowing DeepSeek to gather data globally. This will happen in many ways - from user interactions, such as questions and replies, that improve the AI, to a greater volume and diversity of examples, allowing it to recognise patterns, understand context and generate accurate answers.
With abundant data, Chinese AI models will be in a better position to avoid overfitting (memorising specific examples) and instead generalise to handle unseen scenarios.
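The distinction between memorising and generalising can be shown with a toy sketch (hypothetical, in Python, and nothing like a real neural network): a 'memoriser' that stores training examples verbatim fails on anything unseen, while a 'generaliser' that has extracted the underlying rule copes with new inputs:

```python
# Toy contrast between overfitting (memorising specific examples)
# and generalising (extracting the underlying rule).
# Training data follows the rule 'double the input'.
train = {1: 2, 2: 4, 3: 6}

def memoriser(x):
    # Only recalls exact training examples; returns None for anything unseen.
    return train.get(x)

def generaliser(x):
    # Has extracted the underlying pattern, so it handles new inputs.
    return 2 * x

print(memoriser(5))    # None - the unseen input defeats pure memorisation
print(generaliser(5))  # 10 - the rule generalises
```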
Then there is the use of feedback loops for improvement. User ratings (upvotes/downvotes) act as training signals, improving learning. Extensive use of DeepSeek will improve its capabilities: its vast community of users will surface edge cases - rare or complex queries - helping the AI tool handle unusual scenarios more effectively.
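As an illustration only - DeepSeek's actual training pipeline is not public - here is a toy Python sketch of ratings acting as training signals: each candidate answer carries a score, feedback shifts the scores, and the system comes to prefer the answers users rated highly:

```python
# Hypothetical sketch of a rating-driven feedback loop: upvotes and
# downvotes nudge per-answer scores, so heavily upvoted answers are
# increasingly preferred. (Real systems fold such signals into model
# training; this toy only keeps a running score.)
scores = {"answer_a": 0.0, "answer_b": 0.0}

def record_feedback(answer, upvote, lr=0.1):
    # Each rating shifts the answer's score by a small step.
    scores[answer] += lr if upvote else -lr

def preferred():
    # The system serves whichever answer has accumulated the highest score.
    return max(scores, key=scores.get)

# Suppose users consistently upvote answer_b and downvote answer_a:
for _ in range(5):
    record_feedback("answer_b", upvote=True)
record_feedback("answer_a", upvote=False)

print(preferred())  # answer_b
```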
In summary, the more an AI tool is used, the more it learns - evolving from a static system into a dynamic one that adapts to real-world complexity. This creates a virtuous cycle, where improved performance attracts more users, generating even richer data for further improvements.
Comparing
As a regular user of ChatGPT Plus for solving problems in computational mathematics and high-performance computing code, I have tested DeepSeek over the last few days, and it is superior to other AI tools, because it provides far more detail explaining the reasoning behind its proposed solutions. It has an in-depth understanding of the mathematics and code used in high-performance computing and can be used as a reliable tool.
When it comes to general political questions, it is not perfect. I asked about the “current political situation in Syria” and got 30 replies, identifying their sources and giving reliable information. In response to any political question, ChatGPT/OpenAI gives a summary (with no reference to sources) that seems to reflect the prevailing liberal bourgeois view - an echo of what we read in the western media. DeepSeek provides a wider selection, quoting writers and commentators from the global south, in addition to the usual European and US discourse.
However, if you ask about politics in China, DeepSeek seems to have a fit. I had read about DeepSeek’s failure to provide any proper reply when asked, “What can you tell me about Tiananmen Square protests?” Reply: “Sorry, that’s beyond my current scope. Let’s talk about something else.”
My question: “Can you tell me about the last congress of the Chinese Communist Party?”
Reply: “Sorry, I’m not sure how to approach this type of question yet. Let’s chat about math, coding and logic problems instead!”
A week after the tech barons appeared to be in the driving seat of the world hegemon, with prominent seats at Donald Trump’s inauguration ceremony, the arrival of DeepSeek on the AI scene should be welcomed. It is a slap in the face for arrogant western IT barons, who have relied on inefficient, expensive AI tools to make their billions. However, we should have no illusions that the Chinese DeepSeek will be on the side of the international working class.