Frequently Asked Questions


What is the Photo2Video AI?

Photo2Video AI is a self-service platform that combines deep-learning face animation with Stable Diffusion text-to-image generation, so users can create videos featuring moving, talking avatars. It is designed for businesses and content creators who want to make engaging videos with AI-powered avatars.

Who is the Photo2Video AI for?

The Photo2Video AI is aimed at businesses that want digital humans for commercial use in sales, marketing, training, and customer success. It is also intended for content creators who want to use AI avatars in their videos.

What video format and resolution do you support?

All videos are generated in MP4 format. Output resolution depends on the AI Presenter used and your Photo2Video AI plan:

Standard AI Presenter output resolution: up to 1280×1280 pixels
Premium AI Presenter output resolution: 720p

What is the output video length?

For Photo2Video AI, the video length is limited to 5 minutes.

What are the image upload size & format requirements?

Maximum image size: 4.5 MB for Photo2Video AI. Supported formats: JPEG, JPG, PNG.
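
If you prepare images in bulk, a small pre-upload check can catch files that would be rejected. The following is a minimal sketch in Python based only on the limits stated above; the function name `check_image` is hypothetical, and it assumes 1 MB means 1,000,000 bytes, which the FAQ does not specify. It is not part of the Photo2Video AI product or any official API.

```python
from pathlib import Path

# Limits taken from the FAQ answer above.
# Assumption: 1 MB = 1,000,000 bytes (the FAQ does not say which definition applies).
MAX_IMAGE_BYTES = int(4.5 * 1_000_000)
ALLOWED_EXTENSIONS = {".jpeg", ".jpg", ".png"}


def check_image(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    # Compare the extension case-insensitively, so "face.PNG" is accepted.
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("file too large")
    return problems
```

For a file on disk, the size can be read with `Path(path).stat().st_size` and passed as `size_bytes`. The check looks only at the file extension, not at the actual image contents.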

Generating Faces

To animate faces, you can select from pre-made avatars, upload a facial image, or use the Stable Diffusion-powered text-to-image portrait generator.

Getting the avatars to talk

You can add a voice by typing a script.

Which languages does the Photo2Video AI support?

The studio supports 119 languages, along with various accents and speaking styles.

Can I add pauses to make the text more realistic?

How do I get the right result when generating a face?

Experiment with the pre-created prompts and try out variations, or search for prompts and inspiration on Lexica or other prompt-building platforms.