Microsoft’s new AI model VASA-1 can make pictures ‘talk’

Microsoft has introduced VASA-1, an AI model that can generate a video from a single photo. The tool animates the face in the picture, giving it expressions that match an audio track.

The software giant has unveiled a new artificial intelligence model that can generate videos of talking human faces by combining an audio clip and a still image.

A Microsoft spokesperson said, “VASA-1 can generate realistic videos of people talking, at a resolution of 512×512 pixels and up to 40 frames per second. It paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors.”
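To put those figures in perspective, here is a rough back-of-the-envelope calculation. It is only an illustrative sketch: the function name `video_stats` is hypothetical and is not part of any Microsoft tool, and the defaults simply reuse the 512×512 and 40 fps numbers quoted above.

```python
def video_stats(audio_seconds: float, fps: int = 40,
                width: int = 512, height: int = 512) -> dict:
    """Rough output figures for a talking-face video of a given length.

    Hypothetical helper for illustration only; the defaults are the
    resolution and frame rate quoted in the article.
    """
    frames = round(audio_seconds * fps)
    return {
        "frames": frames,                           # total frames to generate
        "ms_per_frame": 1000 / fps,                 # time budget per frame at this rate
        "pixels_per_second": width * height * fps,  # raw pixels produced each second
    }


stats = video_stats(10)  # a 10-second audio clip
print(stats)
```

At 40 frames per second, each frame has to be produced in about 25 milliseconds for the output to keep pace with the audio, which is why the frame rate matters for real-time avatars.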

What is VASA-1 AI Video Generator?

VASA-1, whose name refers to the “visual affective skills” (VAS) it aims to reproduce, is a high-end Microsoft model designed to recreate human facial expressions as convincingly as possible. The tool can render a wide range of feelings and emotions on the faces of people in photos.

In a nutshell, the tool can turn a simple photograph of a person into a video of that person showing different expressions. It does so by animating the facial muscles, lips, nose, head tilt, and other features.

How the new AI model VASA-1 creates videos from photos

According to Microsoft, VASA-1 takes a user-provided photo and an audio clip and produces a short video. The company says it can handle artistic photos, singing audio, and non-English speech. Microsoft said, “It can handle arbitrary length audio and stably output seamless talking face videos.”

The AI model is capable of producing precise lip-audio synchronization as well as a wide spectrum of expressive facial nuances and natural head motions.

To make the video more realistic, the model accepts optional signals as conditions, such as eye gaze direction (forward-facing, leftwards, rightwards, and upwards), head distance (close-up, extreme close-up), and emotion offsets (neutral, happiness, anger, and surprise).
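Since Microsoft has not released VASA-1 or an API for it, there is no official interface to show. Purely as an illustration of how such optional conditions could be represented, here is a minimal sketch. Every name in it (`Gaze`, `HeadDistance`, `Emotion`, `Conditions`) is hypothetical; only the listed values come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


# Hypothetical types: VASA-1 has no public API. The values mirror the
# optional signals described in the article.
class Gaze(Enum):
    FORWARD = "forward-facing"
    LEFT = "leftwards"
    RIGHT = "rightwards"
    UP = "upwards"


class HeadDistance(Enum):
    CLOSE_UP = "close-up"
    EXTREME_CLOSE_UP = "extreme close-up"


class Emotion(Enum):
    NEUTRAL = "neutral"
    HAPPINESS = "happiness"
    ANGER = "anger"
    SURPRISE = "surprise"


@dataclass(frozen=True)
class Conditions:
    """Optional control signals; None means the signal is not supplied."""
    gaze: Optional[Gaze] = None
    head_distance: Optional[HeadDistance] = None
    emotion: Optional[Emotion] = None

    def active(self) -> dict:
        # Keep only the signals the caller actually set.
        return {name: value.value
                for name, value in vars(self).items()
                if value is not None}


request = Conditions(gaze=Gaze.LEFT, emotion=Emotion.SURPRISE)
print(request.active())
```

Making every field optional reflects the article's point that these signals are extra controls, not required inputs: the photo and the audio clip alone are enough to generate a video.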

Microsoft claims that the tool can create lifelike videos with true-to-life expressions. The company has presented VASA-1 only as a research demonstration and has made it clear that it has no plans to release the product or an API.

Microsoft will not release the product because there is a high possibility of it being misused. The tool is comparable to OpenAI’s Sora in that both generate realistic videos. Sora creates complex scenes with detailed backgrounds, while VASA-1 focuses on human expression. Neither tool is publicly available yet.
