Different ways to provide video and audio inputs to DeepWord


To create a synthetic video using DeepWord, we need a video of a person talking and the audio we want that person to say. DeepWord accepts several kinds of input for both the video and the audio. Let's log into our account and take a look at the different input options.

import os
from dword.core import DeepWord
from dword.utils import play_audio
from nbdev import show_doc
import random
acc = DeepWord(API_KEY, SECRET_KEY)
login successful

Video inputs

Local video

If you have a video on your computer, you can use it directly to generate a synthetic video with DeepWord. Just pass the path to the video as the first argument.

acc.generate_video('local_video.mp4', 'sample_audio.mp3', title = 'local_output')
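Because both inputs are plain file paths, a typo in either one is an easy mistake to make. The following is a minimal sketch of a pre-flight check, not part of the DeepWord API: `missing_inputs` and `generate_if_present` are hypothetical helper names, and the only DeepWord call assumed is `generate_video(video, audio, title=...)` as used above.

```python
from pathlib import Path


def missing_inputs(*paths):
    """Return the given paths that do not point to an existing file."""
    return [str(p) for p in paths if not Path(p).is_file()]


def generate_if_present(acc, video, audio, title):
    """Call acc.generate_video only when both input files exist locally.

    `acc` is assumed to be a logged-in DeepWord account object.
    This wrapper is a hypothetical convenience, not a library function.
    """
    missing = missing_inputs(video, audio)
    if missing:
        raise FileNotFoundError(f"Input file(s) not found: {', '.join(missing)}")
    return acc.generate_video(video, audio, title=title)


# Usage, with `acc` created as shown earlier:
# generate_if_present(acc, 'local_video.mp4', 'sample_audio.mp3', 'local_output')
```

Failing early with a clear message avoids starting a generation request with a path that does not exist.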

Video actors

If you don't have a video of a person talking against a clear background with limited body movement, you can use one of our video actors. Start by downloading our video actors.

'Successfully downloaded all video actors'
!ls video_actors
Anna.mp4    Dalton.mp4  Isaac.mp4   Karen.mp4   Mia.mp4     Richard.mp4
Berto.mp4   Emily.mp4   James.mp4   Marcus.mp4  Micheal.mp4 Sam.mp4
Carlos.mp4  Henry.mp4   Julia.mp4   Mary.mp4    Noelle.mp4  Trey.mp4

Then use any of them to generate videos.

acc.generate_video('video_actors/Julia.mp4', 'sample_audio.mp3', title = 'actor_output')
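If it does not matter which actor appears in the video, one can be picked at random from the downloaded folder. This is a small sketch using only the standard library; `list_actors` and `pick_actor` are hypothetical helpers, and the folder name `video_actors` is taken from the listing above.

```python
import os
import random


def list_actors(folder="video_actors"):
    """Return the sorted paths of the .mp4 files found in the actors folder."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".mp4")
    )


def pick_actor(folder="video_actors", rng=random):
    """Pick one actor video at random; pass random.Random(seed) for repeatability."""
    actors = list_actors(folder)
    if not actors:
        raise ValueError(f"No .mp4 files found in {folder!r}")
    return rng.choice(actors)


# Usage, once the actors are downloaded and `acc` is logged in:
# acc.generate_video(pick_actor(), 'sample_audio.mp3', title='actor_output')
```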

Audio inputs

Just like video, DeepWord offers multiple options for audio input. Before we dive into them, note that you can also pass two video files to DeepWord, with the second video given in place of the audio file.

acc.generate_video('sample_video.mp4', 'another_video.mp4', title = 'two_vids_output')

Local audio

If you have an audio file on your computer, you can use it directly to generate a video with DeepWord. Just pass the path to the audio file as the second argument.

acc.generate_video('sample_video.mp4', 'local_audio.mp3', title = 'local_output')


One of the best features of DeepWord is its text2speech functionality. We support 41 languages and a variety of speakers for each language. Apart from the text you want the speaker to say, the text2speech function requires two more inputs: a language and a speaker.


DeepWord.text2speech(text:str, language:str, speaker:str, outfile='text2speech.mp3')

To see the available languages you can run


Each language has some speakers associated with it. To see the available speakers for a particular language run


Once you have your language and speaker, you can use text2speech as follows

text = "Creating synthetic videos has never been easier thanks to DeepWord's powerful API"
language = 'english_us'

available_speakers = acc._available_speakers(language)
speaker = available_speakers[3]
'en-US-AriaNeural Female'
acc.text2speech(text, language, speaker)
Successfully generated audio file text2speech.mp3
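Picking a speaker by list index, as above, depends on the order of the returned list. The sketch below filters by the trailing gender label instead. It assumes every speaker entry has the form `'<voice name> <gender>'`, which is inferred from the single example `'en-US-AriaNeural Female'` shown above; `speakers_by_gender` is a hypothetical helper, and the second entry in the sample list is made up for illustration.

```python
def speakers_by_gender(speakers, gender):
    """Keep the speakers whose last word matches `gender` (case-insensitive).

    Comparing the last word, rather than searching for a substring,
    avoids matching 'male' inside 'Female'.
    """
    wanted = gender.strip().lower()
    return [s for s in speakers if s.split() and s.split()[-1].lower() == wanted]


# Hypothetical sample list; in practice use the list returned for your language.
sample_speakers = [
    "en-US-AriaNeural Female",   # taken from the output shown above
    "en-US-ExampleNeural Male",  # made-up entry for illustration
]

female_speakers = speakers_by_gender(sample_speakers, "female")

# Usage, with `acc`, `text` and `language` defined as above:
# acc.text2speech(text, language, female_speakers[0])
```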