The tech world is still abuzz after waking yesterday morning to the release of the DeepSeek-R1 family of AI language models, funded and created in China. At one point yesterday tech stocks plummeted, only to return to more or less normal (whatever that means) today.
Has the AI bubble burst, as many (including me) have predicted? Is Microsoft’s AI strategy in jeopardy?
I don’t think so. But let’s dive into the details to fully understand the release’s ramifications.
Planning and Reasoning
DeepSeek rivals the most advanced AI models (such as OpenAI’s o1) in its ability to plan and reason in order to solve a problem.
Consider this problem. Notice that DeepSeek begins with a <think> step in which it plans its attack on the problem:
>>> what is 2*10 divided 4 with pi added to it?
<think>
First, I need to calculate the product of 2 and 10.
Next, I will divide that result by 4.
Finally, I’ll add the value of π (pi) to the quotient obtained from the division.
</think>
Then it executes its plan:
1. **Multiply 2 and 10:**
2 \times 10 = 20
2. **Divide the result by 4:**
20 \div 4 = 5
3. **Add π (pi) to the quotient:**
5 + \pi
Since π is approximately 3.1416, we have:
5 + 3.1416 = 8.1416
**Final Answer:**
{8.14}
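The arithmetic in that transcript is easy to double-check. Here is a minimal Python sketch (mine, not part of DeepSeek’s output) that repeats the same three steps; the variable names are purely illustrative.

```python
import math

# Recompute the prompt "2*10 divided 4 with pi added to it" step by step.
product = 2 * 10             # step 1: multiply 2 and 10
quotient = product / 4       # step 2: divide the result by 4
total = quotient + math.pi   # step 3: add pi to the quotient

print(f"{total:.4f}")  # four decimal places, as in the transcript
print(f"{total:.2f}")  # two decimal places, as in the final answer
```

Both rounded values agree with what the model reported.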
At first blush it seems remarkable; DeepSeek’s methodical thinking process at least appears to closely mimic how humans think (very similar to how OpenAI’s o1 model functions).
But it’s worth pointing out that the DeepSeek models you can run locally are not from-scratch LLMs; they are distillations built on other foundation models (Meta’s Llama and Alibaba’s Qwen). Essentially these models were “taught” rather than trained from nothing, meaning they did not need the vast resources that their teachers required.
DeepSeek is Open-Source
Importantly, DeepSeek’s models – there are six of them varying by number of parameters – are freely downloadable; all but the largest fit comfortably on a decent-size desktop (the 14 billion parameter model occupies about 9GB on my machine).
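To see why a 14-billion-parameter model can land at roughly 9GB, a back-of-the-envelope estimate helps. The sketch below assumes the size is simply parameters times bits per weight. The bit widths are my assumption (I haven’t checked which quantization my download uses), and the helper `estimated_size_gb` is hypothetical; it ignores tokenizer files and metadata.

```python
def estimated_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS_14B = 14e9  # parameter count quoted in the text

# Assumed bit widths: 16 for unquantized weights, lower values for quantized ones.
for bits in (16, 8, 5, 4):
    size = estimated_size_gb(PARAMS_14B, bits)
    print(f"{bits:>2} bits/weight: ~{size:.1f} GB")
```

At around 5 bits per weight the estimate comes to just under 9GB, in the same ballpark as what I see on disk, while full 16-bit weights would need about 28GB.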
Moreover, DeepSeek is offered under the permissive MIT License (“…without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software…”), which means that developers can use it in applications pretty much any way they want. Purportedly, even the model weights are available, although I haven’t found them yet (DeepSeek’s GitHub repo was created only yesterday). And finally, the DeepSeek team has published a comprehensive paper outlining their technical methodology – meaning that anyone, in theory, can reproduce their work (apparently there is work in progress to do just that).
Microsoft and DeepSeek (and Other LLMs Generally)
But how does it affect Microsoft and its strategy?
It’s no secret that LLMs in their short lifetime have become commoditized, a fact Microsoft has wisely recognized. Since ChatGPT was announced just a little over two years ago, dozens of foundation and frontier models have appeared; the model repository Hugging Face now offers well over 1 million fine-tuned large and small language models.
In fact, models form just one part of Microsoft’s overall strategy, which comprises a far more expansive and inclusive view of AI in the enterprise. For Microsoft, the true value of AI lies in the myriad applications it can power – and that developers can build using it.
Microsoft’s vision for the AI-powered enterprise includes providing user interfaces to LLMs connected to corporate data (Copilot); offering a wide assortment of LLMs for developers to make use of; supplying a deep set of AI-focused tools for developers (AI Builder, AI Foundry, and ML Studio); and lastly, providing access to the “fuel” that powers enterprise AI applications: data, including productivity data (Microsoft 365), analytical data (Fabric), and corporate applications through connectors.
Political Ramifications of DeepSeek: Another TikTok?
It’s unlikely that the current US administration will, or can, block the use of DeepSeek as it nearly did with TikTok. By now DeepSeek has been downloaded to millions of computers (including, as you’ve seen, mine); blocking it as a purely technical matter will be close to impossible.
Nevertheless, because DeepSeek originates in China, geopolitics cannot be ignored. I asked it point-blank whether China’s ruler Xi Jinping is a dictator. (As mentioned, I am using a downloaded version of DeepSeek; evidently the online version, hosted in China, is more circumspect.) After a dissertation of more than 600 words, it replied:
Labeling Xi Jinping as a dictator depends on one’s perspective of China’s political system and the definition of dictatorship applied. Considering the unique governance structure and collective leadership within the CCP, it is complex to apply traditional Western definitions of dictatorship to China’s context.
I then asked Microsoft’s homegrown Phi4 model the same question and received more or less the same diplomatic, noncommittal answer:
Ultimately, whether one views Xi Jinping as a dictator may depend on their interpretation of political systems, definitions of democracy and authoritarianism, and perspectives on governance in different cultural contexts.
DeepSeek also (somewhat surprisingly) provided relatively objective answers on topics controversial in China, such as the 1989 Tiananmen Square massacre; however, it refused to answer a question about the state of the Chinese housing market, saying it only provides “helpful and harmless responses.” Hmmm.
Microsoft, OpenAI, and DeepSeek
The interactions above raise some important questions. Although DeepSeek appears to have achieved a new level of LLM transparency, we do not yet know to what extent bias and harmful content are filtered or guardrails have been applied – whereas Microsoft and OpenAI scrupulously follow Responsible AI methodologies. Additionally, not much is known about the Chinese startup that created it, which might raise concerns about using DeepSeek in mission-critical applications.
DeepSeek is not perfect by any means. Its knowledge stops at July 2023, and it doesn’t appear to have scoured every available internet source (for example, it didn’t know that I worked at Microsoft, so evidently it hadn’t seen either my personal website or my LinkedIn profile).
Nevertheless, it’s possible, likely even, that DeepSeek models could show up in Azure’s stable of LLMs, to be evaluated, compared, tested, and perhaps deployed within applications on Azure. So at least in one way DeepSeek could complement Azure.
But, when asked how DeepSeek could impact Microsoft, the LLM itself had a slightly more ominous answer:
…in the fast-evolving tech landscape, companies like DeepSeek could potentially compete with Microsoft in areas such as AI-powered search engines, enterprise software solutions, or cloud services. For example, if DeepSeek develops advanced AI tools that rival Microsoft’s offerings (like Copilot for Office), it could influence market dynamics. Similarly, partnerships or collaborations between the two companies could also emerge in the future.
Well, I doubt either will happen; it’s hard to see a DeepSeek-based Copilot, as Copilot is so tightly integrated into Microsoft 365. My view is rather that DeepSeek – just one component of the overall AI stack – will prove generally beneficial to the overall AI ecosystem and Microsoft in particular.
But we’ll see.