Google Unveils Video and Audio Generation AI Model 'Veo 3,' Signaling a New Era of Creation

Eunsil Ju Reporter

bb311.eunju@gmail.com | 2025-05-21 10:06:05

California – Google officially unveiled its video generation AI model, 'Veo 3,' at its annual developer conference, 'Google I/O 2025,' held on May 20 (local time). Veo 3 is being hailed as a major advance because, unlike existing AI video generation models, which focus primarily on visuals, it can generate audio alongside the video, including music, sound effects, and even dialogue.

Josh Woodward, VP of Google Labs & Gemini, said on the Google I/O stage, "We are entering deeply into a new era of creation," suggesting that Veo 3 will change how content is created.

Veo 3: The Perfect Harmony of Video and Audio

According to Google, Veo 3 significantly improves on the quality of Veo 2, which was released earlier this year to compete with similar models from rival OpenAI. Notably, it can generate video and audio together "for the first time ever," producing realistic video content that goes beyond moving images alone. For example, a street scene can include background traffic noise, a park scene can include birdsong, and characters can speak lines of dialogue, supporting immersive storytelling.

Early access to Veo 3 is currently offered to selected creators through 'VideoFX' in Google Labs, with a broader release planned for the future. Related features are also planned for YouTube Shorts creators, which is expected to bring significant changes to short-form video creation. Google has already demonstrated the model's practical use by collaborating with renowned directors to produce short films with Veo 3.

Image Generation AI 'Imagen 4' and Film Editing Tool 'Flow' Also Unveiled

In addition to video generation technology, Google also announced the image generation AI model 'Imagen 4' and the film editing tool 'Flow' at this event, expanding its ecosystem of AI-powered creative tools.

Imagen 4 improves on its previous version, most visibly in rendering text accurately, which has been a long-standing weakness of AI image generation. It can generate images in various aspect ratios and at resolutions up to 2K, making them suitable for print materials and presentations. It also significantly reduces spelling and typography errors, which makes it easier to create customized cards, posters, and even comics. Imagen 4 can be accessed through the image generation service 'Imagen Factory,' which will be available to Google AI Premium Plan subscribers.

Flow is a new tool described by Google as an "AI tool for film," helping users generate film clips, scenes, and even entire stories. It suggests an era in which AI can help create films without a professional editor. Flow will initially be offered to Google AI Pro and Google AI Ultra plan subscribers in the United States.

Google's latest announcements show AI moving beyond a simple information processing tool toward an active role in creative work. The new models, led by Veo 3, are expected to lower the barrier to entry for video content creation and widen the range of what individual creators can produce. Attention now turns to how far AI technology will spread into the media and art fields.
