<aside>
An AI faceless video generator SaaS lets users create professional-quality videos programmatically with AI, based on the description and theme they provide.
In this guide, we’ll focus on integrating Editly, a versatile open-source video editing library that lets you programmatically generate videos by combining different types of media clips (text, audio, video, images) and adding transitions and effects.
</aside>
<aside>
Before we get into the technical details, here are the key features your platform will offer:
| Purpose | Tool/Framework Name | Website |
|---|---|---|
| Web App | NextJS | nextjs.org |
| Auth & Database | Supabase | supabase.com |
| Programmatic Video Creation | Editly | GitHub repo |
| Advanced Programmatic Video Creation (Alternatives) | Diffusion Studio, Remotion | Diffusion Studio, Remotion |

| API/Model Name | Description | Website |
|---|---|---|
| OpenAI API | Script and caption generation | openai.com |
| Flux | AI image generation | flux dev |
| LTX-Video (Optional) | Image-to-video generation | ltx-video |
| ElevenLabs | Text-to-speech generation for voiceovers | ElevenLabs.io |
The backend for video creation with Editly involves several steps. Here’s a step-by-step breakdown of the process:
1. Processing Requests from the Frontend
The frontend sends an HTTP POST request to your backend with the necessary information (e.g., the topic for the video, which can be used to generate a script). The API layer receives this data.
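As a rough illustration of what the API layer might do with the incoming POST body, here is a minimal validation sketch. The field names (`topic`, `theme`, `durationSeconds`), the defaults, and the 300-second cap are assumptions made for this example; the guide does not prescribe a request schema.

```javascript
// Validate the parsed JSON body of a video-generation request.
// NOTE: the field names and limits below are illustrative assumptions.
function parseVideoRequest(body) {
  if (typeof body !== 'object' || body === null || Array.isArray(body)) {
    return { ok: false, error: 'Request body must be a JSON object' };
  }

  // The topic is the only required field in this sketch.
  const topic = typeof body.topic === 'string' ? body.topic.trim() : '';
  if (topic.length === 0) {
    return { ok: false, error: 'topic is required' };
  }

  // Optional fields fall back to defaults when missing or malformed.
  const theme =
    typeof body.theme === 'string' && body.theme.trim() !== ''
      ? body.theme.trim()
      : 'default';
  const durationSeconds = Number.isFinite(body.durationSeconds)
    ? body.durationSeconds
    : 60;
  if (durationSeconds <= 0 || durationSeconds > 300) {
    return { ok: false, error: 'durationSeconds must be between 1 and 300' };
  }

  return { ok: true, value: { topic, theme, durationSeconds } };
}
```

In a Next.js route handler, you would typically pass the result of `await request.json()` to a function like this and respond with a 400 status when `ok` is `false`.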
2. Generating Content
Once the backend receives the request, it triggers calls to different AI APIs to generate the required assets:
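The asset-generation step can be sketched as a small orchestration function. The `providers` object and its three functions (`generateScript`, `generateImage`, `generateVoiceover`) are hypothetical wrappers that you would implement around the OpenAI, Flux, and ElevenLabs APIs; they are not real SDK calls, and the scene shape is an assumption.

```javascript
// Orchestrate asset generation for one video.
// `providers` is a hypothetical interface supplied by the caller:
//   generateScript(topic)    -> Promise<[{ narration, imagePrompt }]>
//   generateImage(prompt)    -> Promise<string>  (path to an image file)
//   generateVoiceover(text)  -> Promise<string>  (path to an audio file)
async function generateAssets(topic, providers) {
  // 1. The script comes first, because images and voiceover depend on it.
  const scenes = await providers.generateScript(topic);
  if (!Array.isArray(scenes) || scenes.length === 0) {
    throw new Error('Script generation returned no scenes');
  }

  // 2. Images and voiceover are independent of each other, so run them in parallel.
  const narration = scenes.map((scene) => scene.narration).join(' ');
  const [imagePaths, audioPath] = await Promise.all([
    Promise.all(scenes.map((scene) => providers.generateImage(scene.imagePrompt))),
    providers.generateVoiceover(narration),
  ]);

  // 3. Pair each scene's text with its generated image.
  return {
    scenes: scenes.map((scene, i) => ({
      text: scene.narration,
      imagePath: imagePaths[i],
    })),
    audioPath,
  };
}
```

Injecting the providers keeps the orchestration testable with stubs and makes it easy to swap a model later without touching the pipeline.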
3. Video Composing with Editly
Once all the necessary assets (script, voiceover, images) are generated, the backend uses Editly to combine them into a video.
Input Configuration: The backend sets up the video with the required clips:
Video Composition: The backend invokes Editly to create the video. This involves defining the layout (e.g., duration of each clip, transitions, and effects like fades or zooms) and rendering the final video.
Here’s a basic example of how you define a video composition:
```js
{
  width: 900,
  height: 1600,
  outPath: './newsTitle.mp4',
  defaults: {
    layer: { fontPath: './assets/Patua_One/PatuaOne-Regular.ttf' },
  },
  clips: [
    { duration: 10, layers: [
      { type: 'image', path: './assets/91083241_573589476840991_4224678072281051330_n.jpg' },
      { type: 'news-title', text: 'BREAKING NEWS' },
      { type: 'subtitle', text: 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.', backgroundColor: 'rgba(0,0,0,0.5)' }
    ] },
  ],
}
```
Render the Video: The backend then renders the video in a specific format (e.g., MP4) at the desired resolution (e.g., 1080p or 4K).
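To connect the generated assets to a composition, the backend could build the Editly spec programmatically. The sketch below assumes a scene list of `{ text, imagePath }` objects and a single voiceover file. The portrait size presets and the even split of the total duration across clips are simplifications chosen for this example, not requirements of Editly.

```javascript
// Portrait (9:16) output sizes keyed by a resolution label (assumed presets).
const PORTRAIT_SIZES = {
  '720p': { width: 720, height: 1280 },
  '1080p': { width: 1080, height: 1920 },
  '4k': { width: 2160, height: 3840 },
};

// Build an Editly edit spec from generated assets.
// The input shape ({ scenes, audioPath, ... }) is an assumption for this sketch.
function buildEditSpec({ scenes, audioPath, outPath, totalDuration, resolution = '1080p' }) {
  if (!Object.prototype.hasOwnProperty.call(PORTRAIT_SIZES, resolution)) {
    throw new Error(`Unknown resolution: ${resolution}`);
  }
  if (!Array.isArray(scenes) || scenes.length === 0) {
    throw new Error('At least one scene is required');
  }

  // Simplification: every clip gets an equal share of the total duration.
  const clipDuration = totalDuration / scenes.length;

  return {
    ...PORTRAIT_SIZES[resolution],
    outPath,
    audioFilePath: audioPath, // voiceover track laid under the whole video
    defaults: { transition: { name: 'fade' } },
    clips: scenes.map((scene) => ({
      duration: clipDuration,
      layers: [
        { type: 'image', path: scene.imagePath },
        { type: 'subtitle', text: scene.text },
      ],
    })),
  };
}
```

With Editly installed, the resulting object would be passed to its default export, along the lines of `await editly(spec)`. In practice you may prefer to derive each clip's duration from the length of its narration rather than splitting evenly.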
4. Task Management for Video Processing (Optional)
If video generation is resource-intensive and may take a long time (minutes to hours), consider using a task queue to handle video generation asynchronously:
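To make the idea concrete, here is a minimal in-memory queue that processes one job at a time and lets the API report job status. It is only a sketch: the class name and status strings are made up for this example, jobs are lost when the process restarts, and a production setup would more likely use a persistent queue backed by a store such as Redis.

```javascript
// Minimal in-memory job queue (illustrative only; not persistent).
class VideoJobQueue {
  constructor(worker) {
    this.worker = worker; // async function that renders one video
    this.jobs = new Map(); // jobId -> { status, payload, result, error }
    this.pending = []; // jobIds waiting to run
    this.running = false;
    this.nextId = 1;
  }

  // Register a job and return its id right away so the API can respond quickly.
  enqueue(payload) {
    const id = String(this.nextId++);
    this.jobs.set(id, { status: 'queued', payload, result: null, error: null });
    this.pending.push(id);
    this.drain(); // start processing in the background if idle
    return id;
  }

  // Report the status of a job: queued, processing, completed, failed, or unknown.
  getStatus(id) {
    const job = this.jobs.get(id);
    return job ? job.status : 'unknown';
  }

  // Process queued jobs sequentially; failures are recorded, not thrown.
  async drain() {
    if (this.running) return;
    this.running = true;
    while (this.pending.length > 0) {
      const job = this.jobs.get(this.pending.shift());
      job.status = 'processing';
      try {
        job.result = await this.worker(job.payload);
        job.status = 'completed';
      } catch (err) {
        job.error = String(err);
        job.status = 'failed';
      }
    }
    this.running = false;
  }
}
```

The POST endpoint would call `enqueue` and return the job id, while a separate status endpoint calls `getStatus` so the frontend can poll until the video is ready.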
| Tool | Cost | Cost per Video (approx., for a 1-minute video) |
|---|---|---|
| **OpenAI API (GPT-4o model)**: 2k input tokens, plus 5-6k tokens to generate structured JSON outputs | $0.01 / 1k tokens | $0.065 - $0.08 |
| Flux: 10-12 images per video | $0.025 / image | $0.25 - $0.3 |
| LTX-Video (Optional): 10-12 video clips per video | $0.026 / video (6-second clip) | $0.26 - $0.3 |
| ElevenLabs: cost may vary depending on your subscription plan | $0.11 / minute | $0.11 - $0.16 |
| Total Cost | | $0.6 - $0.8 per video |
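The arithmetic behind these estimates can be reproduced with a small helper. The unit prices are copied from the table above; the function name and input shape are assumptions for this sketch, and real prices depend on your plan and provider.

```javascript
// Unit prices taken from the cost table above (USD).
const UNIT_PRICES = {
  per1kTokens: 0.01, // OpenAI API
  perImage: 0.025, // Flux
  perClip: 0.026, // LTX-Video, per 6-second clip
  perVoiceMinute: 0.11, // ElevenLabs
};

// Estimate the per-video cost from usage counts.
function estimateVideoCost({ tokens, images, clips = 0, voiceMinutes }) {
  const script = (tokens / 1000) * UNIT_PRICES.per1kTokens;
  const imageCost = images * UNIT_PRICES.perImage;
  const clipCost = clips * UNIT_PRICES.perClip;
  const voiceover = voiceMinutes * UNIT_PRICES.perVoiceMinute;
  return {
    script,
    images: imageCost,
    clips: clipCost,
    voiceover,
    total: script + imageCost + clipCost + voiceover,
  };
}

// Example: 7k tokens, 10 images, 10 optional clips, 1 minute of voiceover.
const estimate = estimateVideoCost({ tokens: 7000, images: 10, clips: 10, voiceMinutes: 1 });
console.log(estimate.total.toFixed(2));
```

For the example inputs, the parts are $0.07 + $0.25 + $0.26 + $0.11, which lands at about $0.69 and falls inside the $0.6 - $0.8 range quoted in the table. Dropping the optional LTX-Video clips brings the same example down to about $0.43.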
The broader market is crowded, but niches often remain underserved. Focus on a specific audience or use case, such as: