Tencent's quiet collaboration with DeepSeek enhances AI model performance

by Lu Keyan

In a move that underscores China’s growing strength in open-source AI development, Tencent has quietly partnered with DeepSeek to enhance the performance of DeepEP, a communication library central to the training of large AI models. The collaboration, which only came to light recently via a GitHub post by a DeepSeek engineer, reflects a rare yet significant cooperation between two of the country’s leading AI players.

According to the engineer, Tencent’s contributions brought about a “huge speedup” in DeepEP’s capabilities, directly benefiting all developers using DeepSeek’s open-source offerings.

Jiemian News spoke exclusively with Tencent’s StarLake Network team, the group behind the infrastructure powering its proprietary Hunyuan model, to learn more about the collaboration.

The technical exchange dates back to February this year, when DeepSeek open-sourced five core codebases aimed at allowing developers to reproduce high-performance training with a fraction of the hardware traditionally required. Among them was DeepEP, a communication library for Mixture-of-Experts (MoE) models. The MoE architecture reduces the cost and computational burden of training and deploying large-scale models such as DeepSeek’s own and, reportedly, GPT-4.
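To see why MoE models lean so heavily on a communication library, consider a toy sketch of token routing. It is an illustration under stated assumptions, not DeepEP's actual code: the random router, the even split of experts and tokens across GPUs, and the function names `route_tokens` and `traffic_matrix` are all hypothetical.

```python
import random

def route_tokens(num_tokens, num_experts, top_k, seed=0):
    """Pick top_k experts per token from random scores (stand-in for a learned router)."""
    rng = random.Random(seed)
    routes = []
    for _ in range(num_tokens):
        scores = [rng.random() for _ in range(num_experts)]
        ranked = sorted(range(num_experts), key=lambda e: scores[e], reverse=True)
        routes.append(ranked[:top_k])
    return routes

def traffic_matrix(routes, num_gpus, num_experts):
    """Count token copies sent from each source GPU to each destination GPU.

    Assumes tokens and experts are split evenly across GPUs.
    """
    experts_per_gpu = num_experts // num_gpus
    tokens_per_gpu = len(routes) // num_gpus
    matrix = [[0] * num_gpus for _ in range(num_gpus)]
    for token_id, experts in enumerate(routes):
        src = token_id // tokens_per_gpu      # GPU that holds the token
        for e in experts:
            dst = e // experts_per_gpu        # GPU that hosts the chosen expert
            matrix[src][dst] += 1
    return matrix

routes = route_tokens(num_tokens=64, num_experts=8, top_k=2)
matrix = traffic_matrix(routes, num_gpus=4, num_experts=8)
remote = sum(matrix[s][d] for s in range(4) for d in range(4) if s != d)
total = sum(map(sum, matrix))
print(f"{remote} of {total} token copies cross GPUs")
```

Because each token may be sent to experts hosted on other GPUs, every MoE layer triggers an all-to-all exchange, and the speed of that exchange depends directly on the network underneath it.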

Tencent was an early adopter of the MoE framework in China, having implemented it in Hunyuan by early 2024. Previously, such models relied on Nvidia’s NCCL communication library, which posed high costs and limited flexibility. DeepEP offered a more accessible alternative, but its performance was uneven, particularly in the RoCE (RDMA over Converged Ethernet) networks commonly used by Chinese tech firms. Designed initially for InfiniBand, DeepEP struggled to maintain speed and efficiency on RoCE, leading to significant communication delays during model training.

These delays had a tangible cost. Dr. Xia Yinben, chief architect of the StarLake Network Lab, explained that inefficient networking forces expensive GPUs to idle while waiting for data transfers, leading to higher operational costs and slower responses for users.
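The cost of idle GPUs can be put into rough numbers. The sketch below is a simplified model, not a Tencent formula: it assumes communication does not overlap with computation, and the step times and hourly rate are hypothetical placeholders.

```python
def gpu_utilization(compute_s, comm_s):
    """Fraction of each training step the GPU spends computing.

    Assumes communication is not overlapped with computation.
    """
    return compute_s / (compute_s + comm_s)

def idle_cost_per_hour(hourly_rate, compute_s, comm_s):
    """Share of the hourly GPU bill paid for time spent waiting on the network."""
    return hourly_rate * (1 - gpu_utilization(compute_s, comm_s))

# Hypothetical figures: 0.8 s of compute and 0.2 s of communication per step,
# on a GPU billed at 2.00 currency units per hour.
util = gpu_utilization(0.8, 0.2)
wasted = idle_cost_per_hour(2.00, 0.8, 0.2)
print(f"utilization: {util:.0%}, idle cost per GPU-hour: {wasted:.2f}")
```

In this model, any reduction in communication time raises utilization and shrinks the idle share of the bill across every GPU in the cluster.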

Tencent’s advantage in addressing this issue, Xia said, stemmed from its long-running investments in networking technologies driven by high-demand applications across QQ, WeChat, online gaming, and cloud services. In 2022, the company began developing a dedicated network architecture tailored to AI workloads, known as StarLake.

The team optimized DeepEP’s performance under RoCE by adapting it to Tencent’s in-house TRMT (Tencent Remote Memory Transport) communication library. Drawing on research into the RoCEv2 protocol stack and dual-port network interface cards, they sought to better utilize available bandwidth while reducing latency. TRMT enabled GPUs to bypass the CPU and directly manage RDMA (Remote Direct Memory Access), minimizing control overhead and accelerating data exchange.

Tencent reports that these enhancements led to a 100% performance improvement under RoCEv2 and a 30% gain in InfiniBand environments. In practical terms, said Huang Xiaojie, one of the network architects involved, “a 10% performance gain in training translates to a 10% cost saving. For inference, it also means users wait less—say, from 10 seconds to 9 seconds per query.” While those gains remain internally benchmarked and may vary under different workloads or hardware conditions, they suggest meaningful efficiency improvements in both model training and deployment.
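The figures above can be related with a little arithmetic. The sketch below assumes that "performance improvement" means a gain in throughput, which the article does not specify, and the function names are hypothetical.

```python
def time_saving(throughput_gain):
    """Fractional reduction in run time for a given fractional throughput gain.

    A gain of g makes the job take 1 / (1 + g) of the original time,
    so the saving is g / (1 + g).
    """
    return throughput_gain / (1 + throughput_gain)

def throughput_gain_from_times(before_s, after_s):
    """Fractional throughput gain implied by a drop in per-task time."""
    return before_s / after_s - 1

for gain in (1.00, 0.30, 0.10):
    print(f"{gain:.0%} throughput gain -> {time_saving(gain):.1%} less time")

print(f"10 s -> 9 s implies a {throughput_gain_from_times(10, 9):.1%} throughput gain")
```

Under that assumption, a 100% gain halves the time spent, while a 10% gain saves a little over 9% of the time. Huang's round numbers are therefore a close approximation rather than an exact identity.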

Tencent’s emphasis on RoCE over InfiniBand reflects broader strategic considerations. InfiniBand, favored in high-performance computing for its low latency, is largely dominated by Nvidia and carries higher costs and supply-chain risks. From the outset, Tencent built its AI infrastructure around Ethernet-based RoCE and developed its own communication libraries, first TCCL and more recently TRMT.

Chen Mingzhuo, another architect from the StarLake team, said Tencent and DeepSeek maintained ongoing communication not only around troubleshooting but also on the future evolution of AI networking. Their shared priority is minimizing GPU idle time caused by communication bottlenecks.

Traditionally, data transfer coordination within AI systems has relied on the CPU. Tencent’s approach is to link multiple GPUs more tightly, allowing them to access each other’s memory directly. This architecture reduces the need for CPU mediation and compensates for the lower compute capacity of domestic GPUs—an increasingly common constraint in China’s AI ecosystem.

The optimized version of DeepEP has since been contributed back to the open-source community and deployed in Tencent’s Hunyuan model. Other Chinese tech firms have also expressed interest in the enhancements and provided feedback, signaling a broader impact on the domestic AI infrastructure landscape.

Tencent, in this case, is both a beneficiary of and a contributor to the DeepSeek ecosystem. During Tencent’s recent earnings call, chairman and CEO Pony Ma expressed admiration for DeepSeek’s openness and efficiency, calling it “a truly open and free product” and noting that Tencent’s cloud services and its AI assistant Yuanbao have both integrated DeepSeek models.

The collaboration also reflects a deeper commitment by Tencent to open-source participation. Beyond cost efficiency or technical convenience, the company sees open-source development as key to building trust and accelerating innovation in an increasingly competitive global AI race.
