ShengShu Launches New Generative AI Video Model to Rival OpenAI's Sora and G...


AI-generated image

TMTPOST -- ShengShu Technology, one of China’s fastest-growing multimodal generative artificial intelligence startups, has unveiled a new version of its AI video generation model aimed squarely at challenging OpenAI’s Sora 2 and Google’s Veo 3.1, two of the world’s most advanced text-to-video systems.

The Beijing-based firm said on Tuesday that its new release, Vidu Q2, significantly improves consistency, narrative control, and creative flexibility, marking a step forward in the company’s ambition to compete globally in the emerging field of AI-driven video creation.

According to ShengShu, Vidu Q2 allows creators to upload and merge up to seven reference images—covering faces, scenes, or props—into a single coherent video. The model’s new “multi-entity consistency” feature blends these visual elements with text prompts while maintaining the unique characteristics of each reference, reducing the distortions and blending errors that often appear in existing models.

“Vidu Q2 marks a new chapter in AI video creation,” said Luo Yihang, ShengShu’s chief executive officer, during the product announcement. “We’re entering an era where AI doesn’t just create videos but acts, reacts, and tells stories alongside human creators. This launch goes beyond simple generation—it’s about teaching AI to perform and express emotion.”

Luo said the company’s goal is not to replace human creativity but to expand it. “With each release, we bring technology and imagination closer together,” he said. “Our aim is to make creativity more accessible—turning imagination into visible, emotional storytelling.”

The Vidu Q2 model introduces several new features that position it directly against Western rivals. Like Google’s Veo 3.1, Vidu Q2 supports transition animations that allow users to upload only the first and last frames of a scene, letting the model generate the in-between motion. This offers creators enhanced control over narrative flow and pacing—a capability particularly valued in film and advertising production.

The company also released a Vidu Q2 application programming interface (API), allowing enterprises and studios to integrate the model into their workflows for automated or customized content generation.

ShengShu emphasized that its new system delivers visual quality comparable to Sora 2 and Veo 3.1 while running faster and costing less, potentially making high-quality generative video creation more accessible to independent creators and small businesses.

Industry insiders told Yicai Global that the pricing advantage could prove decisive. While U.S.-based models require extensive cloud resources and expensive compute credits, ShengShu’s localized infrastructure and optimized compression algorithms make Vidu Q2 considerably cheaper to operate.

In one scenario, Vidu Q2 was prompted to generate a video depicting a blade battery module moving on a conveyor belt inside a Chinese electric vehicle factory, being scanned by a Siasun yellow industrial robot, with a digital screen showing “99.92” in simplified Chinese characters.

The system successfully fused all visual elements—the battery, robotic arm, Siasun logo, and Chinese text—into a smooth, stable sequence. Observers said the video maintained high visual fidelity, especially in rendering Chinese characters accurately, demonstrating the strength of the multi-entity consistency feature.

In comparison, Google’s Veo 3.1, which supports up to three reference images, failed to reproduce the Chinese text correctly. OpenAI’s Sora 2 handled the text accurately but mistakenly changed the Siasun logo to that of Nissan Motor, showing the difficulty of managing multiple distinct references across frames.

A second test involved a short dialogue scene: a Chinese chairman angrily asking, “The battery caught fire, are you messing with me?” followed by an American CEO replying in English, “Not me, it’s them,” in a Shanghai boardroom setting.

Vidu Q2 generated the scene using reference images for the characters’ expressions. The video demonstrated accurate lip synchronization in both languages and convincing facial animation for anger and frustration. However, the emotional tone of the accompanying audio was relatively flat, lagging behind the natural expressiveness achieved by Veo 3.1.

Despite that, analysts said the results highlight ShengShu’s progress in cross-lingual emotional modeling and multimodal consistency—areas considered technically challenging even for global leaders.

Founded in March 2023 by researchers from Tsinghua University’s Institute for AI Industry Research, ShengShu has quickly risen to prominence in China’s fast-evolving generative AI industry. The startup launched Vidu 1.0 in April 2024 and has since accumulated 30 million users across more than 200 countries and regions, generating over 400 million videos to date.

Vidu’s early versions could produce five- to eight-second clips at 1080p resolution from text or image prompts in either Chinese or English. The Q2 update builds on that base with improved realism, narrative capability, and expanded creative control.

Analysts say the company’s trajectory mirrors China’s broader push to narrow the technological gap with U.S. AI developers. “China’s AI ecosystem is catching up fast,” said an industry expert at a Beijing venture capital firm. “ShengShu’s focus on multimodal integration—especially with localized features like Chinese text and cultural nuances—gives it an edge in domestic and Asian markets.”

Generative video has become one of the most competitive frontiers in AI development. Since OpenAI’s Sora first stunned the industry with its photorealistic videos in early 2024, companies worldwide have raced to build their own models capable of producing complex, cinematic sequences directly from text prompts.

Google’s Veo 3.1 and Anthropic’s experimental systems have set the bar high for quality and consistency, but Chinese developers, including ShengShu, Kuaishou with Kling, and Tencent with Hunyuan Video, are rapidly improving.

“The next phase of competition is not just about realism,” said Luo. “It’s about emotional intelligence—how well AI can understand and express human feelings through visual storytelling.”

With Vidu Q2, ShengShu aims to establish itself as a major global player in AI video, blending scientific precision with artistic expression. Luo summed it up: “We want to make imagination visible. This is where technology and emotion finally meet.”

Notice: The content above (including the pictures and videos if any) is uploaded and posted by a user of NetEase Hao, which is a social media platform and only provides information storage services.

2025-11-09 12:00:49
TMTPost APP (钛媒体APP)
Independent finance and technology media
