
China’s Moonshot AI Unveils Kimi K2 Thinking to Take on GPT-5 and Gemini

When Moonshot AI rolled out its newest large language model, Kimi K2 Thinking, it was more than another product announcement: it was a declaration of intent.

For China’s fast-rising AI champion, the launch marks a dramatic re-entry into the global race for artificial intelligence dominance. The company describes its model as a milestone in “reasoning intelligence,” capable of chaining hundreds of logical steps and tool calls with minimal human supervision.

To enthusiasts in China’s tech circles, the debut felt cinematic. As one social-media commentator put it, “The treasure island of Monte Cristo has reappeared—the prisoner has returned, this time with a plan that shocks the world.”

Moonshot AI’s comeback arrives just weeks ahead of a crowded lineup of heavyweight releases: Google’s Gemini 3, OpenAI’s expected GPT-5.1, and DeepSeek’s new generation of open-source models. Yet it is Moonshot AI that has grabbed global headlines first.

A Benchmark Moment for China’s AI Ambitions

The new model has quickly become one of the most talked-about developments in the AI community. Thomas Wolf, co-founder of open-source platform Hugging Face, summed up the sentiment on X: “Is this another ‘DeepSeek moment,’ where open source once again outpaces closed source?”

When DeepSeek’s open-source R1 model briefly surpassed OpenAI’s o1 in reasoning benchmarks earlier this year, it marked a symbolic victory for open development. Moonshot AI is now aiming higher, positioning Kimi K2 Thinking directly against closed-source leaders like GPT-5 and Anthropic’s Claude Sonnet 4.5.

While analysts acknowledge that K2 Thinking still has rough edges, few dispute its importance. For a company that some doubted could keep pace after DeepSeek’s surge, the new release restores Moonshot AI’s standing among the world’s top model developers.

“Kimi K1.5 was exploration. K2 showed technical maturity. K2 Thinking cements confidence—inside and outside the company,” one industry investor told CNBC. “It proves Moonshot AI still belongs in the first echelon.”

Much of the early buzz has centered on cost. Rumors circulated that training K2 Thinking required only $4.6 million—a fraction of the hundreds of millions reportedly spent by U.S. rivals.

In an online AMA on Reddit on November 11, Moonshot AI’s founder Yang Zhilin, joined by partners Zhou Xinyu and Wu Yuxin, addressed the speculation head-on.

“That number isn’t official,” Yang said. “Training cost can’t be captured by a single figure—it includes exploration, failed experiments, and endless iteration.”

The team explained that what mattered wasn’t dollars spent, but how efficiently every GPU was pushed. Moonshot uses InfiniBand-connected H800 GPUs, hardware that lags the top U.S. systems but, as engineers put it, “was driven to its limits.”

K2 Thinking’s most unconventional choice may be its optimizer. Instead of relying on established algorithms, Moonshot adopted Muon, a largely untested optimizer. The decision raised eyebrows, but the team insists it followed rigorous scaling-law validation and small-scale testing before full deployment.

“Before Muon, we eliminated dozens of other optimizers,” said Zhou. “By the time we scaled up, we knew the risk profile intimately.”

On data strategy, Moonshot offered a rare look into its training philosophy. “Finding the right dataset is an art,” one engineer said during the AMA. “Different data sources interact in complex ways—intuition matters, but evidence decides.”

The company declined to disclose dataset details but emphasized that each architectural change underwent strict ablation testing before scaling. “If the model shows any instability, scaling stops immediately,” Wu noted.

K2 Thinking currently supports text-based interaction only, a deliberate decision. Video and multimodal models demand vastly higher data preparation and training resources, the team said. A million-token context window has already been tested but is temporarily withheld because of cost. “It’ll likely return in future releases,” Yang added.

Many early users have praised Kimi K2 Thinking for its natural prose style—balanced, coherent, and sometimes poetic. According to the company, this reflects a mix of strong pre-training foundations and targeted fine-tuning during reinforcement learning.

“The tone and rhythm of a model reflect the taste of the team behind it,” Yang said.

Still, some testers have complained the model feels overly cautious or “too positive” in combative dialogues. The team concedes the point. “It’s a persistent challenge to reduce unnecessary filtering while maintaining safety,” Zhou said. The company is even open to revisiting policies on mature content if robust age-verification systems are implemented.

Where K2 Thinking truly stands out is in reasoning depth. It can complete 200 to 300 sequential tool calls in a single chain, sustaining coherent logic throughout. That’s a major step toward practical “agentic reasoning,” where models plan, act, and adjust autonomously.

Moonshot credits an end-to-end agent reinforcement learning approach combined with INT4 inference, which accelerates long reasoning sequences without degrading accuracy.

This capability puts K2 Thinking squarely in competition with models like Anthropic’s Claude, known for long-term planning and adaptive problem solving. “We’ve lowered the entry barrier for deep reasoning,” Yang said.
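The long tool-call chains described above can be pictured as a simple loop in which a policy picks a tool, observes the result, and decides what to do next. The following is only a minimal sketch under stated assumptions: `run_agent`, `Policy`, `Tool`, and the toy `add_one` tool are hypothetical names for illustration, the policy is a hand-written stand-in for the model, and nothing here reflects Moonshot's actual implementation.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types: a tool maps a string argument to a string observation.
Tool = Callable[[str], str]
Step = Tuple[str, str, str]  # (tool name, argument, observation)
# A policy reads the history and returns (tool name, argument), or None to stop.
Policy = Callable[[List[Step]], Optional[Tuple[str, str]]]


def run_agent(policy: Policy, tools: Dict[str, Tool], max_steps: int = 300) -> List[Step]:
    """Run sequential tool calls until the policy stops or the step budget runs out."""
    history: List[Step] = []
    for _ in range(max_steps):
        action = policy(history)
        if action is None:  # the policy decides the task is finished
            break
        name, arg = action
        tool = tools.get(name)
        # An unknown tool becomes an observation the policy can react to.
        observation = tool(arg) if tool else f"error: unknown tool {name!r}"
        history.append((name, arg, observation))
    return history


def add_one(arg: str) -> str:
    """Toy tool: increment an integer passed as text."""
    return str(int(arg) + 1)


def count_to_five(history: List[Step]) -> Optional[Tuple[str, str]]:
    """Toy policy standing in for the model: keep calling add_one until the result is 5."""
    last = history[-1][2] if history else "0"
    return None if last == "5" else ("add_one", last)


if __name__ == "__main__":
    for step in run_agent(count_to_five, {"add_one": add_one}):
        print(step)
```

The `max_steps` budget is the part that corresponds to the 200 to 300 call figure: the loop itself is trivial, and the hard part a model has to supply is a policy that stays coherent across that many steps.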

The company also revealed research on a new architecture called KDA (Kimi Delta Attention), slated for the next-generation K3 model. KDA is designed to balance massive context windows with faster throughput, signaling Moonshot’s continued focus on efficiency rather than raw parameter scale.

A Trillion-Parameter Powerhouse

According to Moonshot’s technical documentation, Kimi K2 Thinking is its most powerful open-source reasoning model to date, featuring 1 trillion parameters and a 384-expert Mixture-of-Experts (MoE) structure.

It has achieved industry-leading scores on reasoning, agentic search, and coding benchmarks: 44.9% on Humanity’s Last Exam with tools, 60.2% on BrowseComp, and 71.3% on SWE-Bench Verified. Those figures place it in the same competitive band as the newest Western models.

More impressively, the system sustains hundreds of reasoning steps without manual correction. In one demonstration, it solved a PhD-level mathematics problem through 23 rounds of reasoning and tool use, showcasing multi-stage planning and self-correction rarely seen outside research labs.

K2 Thinking also excels in coding tasks, particularly in front-end development using HTML and React. It can translate ideas into working interfaces, automatically debugging and adjusting in real time. The model performs well in agent-based coding environments, where it collaborates with other software agents to handle complex, multi-phase workflows.

Large reasoning models typically struggle with latency and memory overhead. Moonshot tackled the issue with Quantization-Aware Training (QAT) during post-training, applying INT4 weight-only quantization to the MoE components.

The result is near-native accuracy with roughly double the generation speed and lower GPU memory usage, both crucial for commercial scalability.

“Reasoning-oriented models have long decoding lengths, which makes quantization tricky,” explained Wu. “But with QAT we preserve quality while cutting cost. That’s the kind of engineering efficiency this era demands.”
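To make the idea of INT4 weight-only quantization concrete, here is a minimal sketch of the quantize-then-dequantize ("fake quantization") step that quantization-aware training typically places in the forward pass. The symmetric scheme, the per-group scale, the group size, and the function names are all assumptions for illustration; the article does not describe Moonshot's actual recipe, and a real QAT setup would also need a gradient rule such as a straight-through estimator, which is omitted here.

```python
from typing import List, Tuple

INT4_MIN, INT4_MAX = -8, 7  # range of a signed 4-bit integer


def quantize_group(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric INT4 quantization of one weight group; returns (codes, scale)."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 7; fall back to 1.0 for an all-zero group.
    scale = max_abs / INT4_MAX if max_abs > 0 else 1.0
    codes = [max(INT4_MIN, min(INT4_MAX, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_group(codes: List[int], scale: float) -> List[float]:
    """Recover approximate floating-point weights from INT4 codes."""
    return [c * scale for c in codes]


def fake_quantize(weights: List[float], group_size: int = 4) -> List[float]:
    """Quantize and immediately dequantize, group by group.

    Training against these rounded values lets the model adapt to the
    rounding error it will see at INT4 inference time.
    """
    out: List[float] = []
    for start in range(0, len(weights), group_size):
        codes, scale = quantize_group(weights[start:start + group_size])
        out.extend(dequantize_group(codes, scale))
    return out


if __name__ == "__main__":
    original = [7.0, -3.0, 1.2, 0.0]
    print(quantize_group(original))
    print(fake_quantize(original))
```

Only the weights are stored in 4 bits in this sketch; activations stay in higher precision, which is what "weight-only" refers to. The per-weight rounding error is at most half the group's scale, so smaller groups trade extra scale storage for tighter accuracy.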

For years, the AI arms race was defined by model size—more parameters, more power. Moonshot AI’s latest release suggests that the frontier has shifted. The new competition centers on inference efficiency, reasoning coherence, and usability.

Analysts say the approach echoes a broader trend across the industry: focusing less on raw scale and more on intelligent design. “The big players are learning that trillion-parameter bragging rights mean little if latency kills adoption,” said a Beijing-based AI investor.

Moonshot’s challenge is clear. Maintaining momentum will require proving that K2 Thinking can match Western models not only in benchmark tests but also in enterprise adoption. Companies across finance, manufacturing, and education are already experimenting with agent-style AI systems that automate planning and analysis.

The competition is fierce. OpenAI’s upcoming GPT-5.1 is rumored to integrate advanced multimodal reasoning, while Google’s Gemini 3 aims for tighter integration with search and workspace tools. DeepSeek, the open-source rival that shook the market earlier this year, is also preparing its next upgrade.

“In this new phase, it’s not just about who trains the biggest model,” said an industry analyst. “It’s about who can balance depth of technology, engineering efficiency, and ecosystem strategy.”

Moonshot AI appears keenly aware of that equation. Its mix of pragmatic engineering and bold experimentation has made it one of the few Chinese firms still considered contenders on the global stage.

Kimi K2 Thinking may not instantly dethrone GPT-5 or Claude, but it demonstrates that the world’s most ambitious AI work is no longer confined to Silicon Valley.

Moonshot’s engineers say the next generation, K3, will feature the new KDA architecture and possibly multimodal capabilities. They’re also considering selective open sourcing—particularly in alignment and safety components—to foster community research while preventing misuse.

For now, K2 Thinking stands as both a technological statement and a philosophical one: that in the evolving AI era, innovation is less about sheer power and more about how intelligently that power is managed.

As Yang put it at the close of the AMA: “AI isn’t just about thinking faster—it’s about thinking better. With Kimi K2 Thinking, we want to prove that better thinking can come from anywhere.”

Notice: The content above (including the pictures and videos if any) is uploaded and posted by a user of NetEase Hao, which is a social media platform and only provides information storage services.

2026-01-19 07:23:00
钛媒体APP
Independent finance and technology media