Minsheng Securities: The release of the Doubao open-source video model is expected to activate the "visual market" and open up growth space

DATE: Feb 11 2025

Minsheng Securities said in a research report that the release of Doubao's open-source video model "VideoWorld" makes video generation a general method of knowledge learning. Acting as an "artificial brain" in the real world, the model learns knowledge through "vision" alone, "predicts" the future, and "understands" cause and effect. The brokerage expects this to activate the "visual market", open up room for growth, and benefit companies positioned to use video model capabilities.

Event: On February 10, the Doubao large model team officially announced the release of the experimental video generation model "VideoWorld". Unlike mainstream multimodal models such as Sora, DALL-E, and Midjourney, VideoWorld is, according to the announcement, the first in the industry to understand the world without relying on a language model.

The main views of Minsheng Securities are as follows:

The latest achievement in open-source video generation models: perceiving the world through vision alone

Video generation becomes a universal method of knowledge learning, acting as an "artificial brain" in the real world. As a general-purpose experimental video generation model, VideoWorld dispenses with the language model and handles comprehension and reasoning tasks in a unified way. It is also built on a latent dynamic model that efficiently compresses the change information between video frames, significantly improving the efficiency and effectiveness of knowledge learning. The project's code and model have been open-sourced.

Without relying on any reinforcement learning search or reward function mechanism, VideoWorld has reached a professional 5-dan level in 9x9 Go and can perform robot tasks in a variety of environments. The team believes that video generation can become a universal method of knowledge learning and act as an "artificial brain" for thinking and acting in the real world.

The model learns knowledge, "predicts" the future, and "understands" cause and effect through "vision" alone

The research team built two experimental environments: video-based Go matches and video-based simulated robot control. Both retain rich visual information while compressing the visual changes related to key decisions and actions, which enables more effective learning from video. As a result, this purely visual model can "predict" the future and "understand" cause and effect. Going forward, the Doubao team will focus on applying the model in real-world environments, where it still faces challenges such as high-quality video generation and generalization across multiple environments.

The ability to visually perceive the world is expected to activate the "visual market" and open up room for growth

Hikvision is a world-renowned leader in the video surveillance industry. According to the 2024 global security top 50 list, it ranks first with 9.722 billion US dollars in 2023 security product sales revenue, more than the next two companies combined. In 2022, Hikvision clarified its Artificial Intelligence of Things (AIoT) strategy, and in 2023 the company officially launched the "Guanlan Large Model" to help various industries achieve digital and intelligent upgrades. According to an Omdia report, Hikvision's share of the global video surveillance market reached 25.9% in 2022, significantly ahead of second place. Given its huge deployed video surveillance network, the arrival of the open-source video model is a shot in the arm for Hikvision.

EZVIZ Network's smart home camera business accounted for 62.07% of revenue in 2023. As the company's cash cow and first growth curve, the business holds a leading market share, and for many years it has ranked first in brand rankings for related categories on Tmall, Douyin, and other platforms during shopping festivals such as Double 11 and 618. In addition, the company has built a closed-loop ecosystem in vision technology and device-cloud collaboration that tightly integrates hardware, software, and cloud platforms, providing strong support for functions such as intelligent detection, intelligent identification, and AI analysis and reasoning. The release of the visual model is expected to take EZVIZ Network's vision business to a higher level.

It is recommended to pay attention to:

Hikvision (002415.SZ), EZVIZ Network (688475.SH), Dahua (002236.SZ), TransInfo Technology (002373.SZ), Winner Technology (300609.SZ), Neta Software (603189.SH), Meishi Technology (001229.SZ), etc.

Risk warning: technology implementation may fall short of expectations, and industry competition may intensify.
