I built a standalone Remotion-based video rendering microservice with a clean API surface. n8n (running on a separate Oracle Cloud VM) generates scripts with DeepSeek, audio with Google Cloud TTS, and images with GPT Image Mini, then POSTs a JSON payload to the microservice, which renders the final video and returns it via a download endpoint. Five production templates cover 16:9 long-form, 9:16 shorts, and static thumbnails. The entire service runs in a single ARM64 Docker container on Oracle Cloud's free tier, with sub-30-second cold starts and no serverless cost spikes.
DeepSeek writes the script, Google Cloud TTS synthesizes the voiceover, and GPT Image Mini generates on-brand images. All assets are uploaded to a CDN.
n8n sends a single JSON payload with the composition name, props (title, text, images, audio URL), and branding config; the service responds with a job ID.
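A sketch of what that payload might look like. The field names here are illustrative assumptions, not the service's exact schema:

```typescript
// Hypothetical render payload -- field names are illustrative, not the
// service's documented schema.
const payload = {
  composition: "DreamShort", // template name, as listed by /compositions
  props: {
    title: "What Flying Dreams Mean",
    text: "Dreams about flying often signal a desire for freedom...",
    images: ["https://cdn.example.com/img/flying-1.png"],
    audioUrl: "https://cdn.example.com/tts/voiceover-en.mp3",
  },
  branding: { watermark: "@dreamchannel", accentColor: "#7c3aed" },
};

// n8n would POST this and keep the returned jobId for polling, e.g.:
// const res = await fetch("http://render-host:3000/render", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(payload),
// });
// const { jobId } = await res.json();
```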
The microservice loads the matching template (ArticleYT, DreamShort, HoroscopeWeekly, etc.), syncs scene durations to the audio, and orchestrates Ken Burns zooms, subtitles, watermarks, and music fade layers.
Remotion renders the composition frame-by-frame to an MP4 with h264 + AAC encoding. Progress is tracked in an in-memory job store for polling.
Once complete, the MP4 is exposed via a signed download endpoint. n8n pulls the file and uploads it to YouTube / TikTok / Instagram via their respective APIs.
Images are generated once per video, then reused across the 3 language versions — only audio and text change. This cuts image generation costs by 66%.
| Template | Format | Resolution | Duration | Use Case |
|---|---|---|---|---|
| ArticleYT | 16:9 | 1920×1080 | 8–15 min | YouTube long-form from blog articles |
| DreamShort | 9:16 | 1080×1920 | 2–3 min | TikTok / Reels / Shorts — dream interpretation |
| HoroscopeWeekly | 9:16 | 1080×1920 | ~5 min | TikTok / Reels / Shorts — weekly horoscope per sign |
| DreamThumbnail | 16:9 | 1280×720 | 1 frame | Static thumbnail for DreamShort videos |
| HoroscopeThumbnail | 16:9 | 1280×720 | 1 frame | Static thumbnail for HoroscopeWeekly videos |
Every template is assembled from a shared library of reusable components (KenBurnsImage, AudioLayer, GlowText, AnimatedSubtitle, Watermark, FloatingParticles) composed into scene sequences. Adding a new template takes hours, not days: I register it in Root.tsx, define its Zod schema in types/common.ts, and compose scenes from the existing library.
| Endpoint | Method | Purpose |
|---|---|---|
| /health | GET | Health check for orchestration and uptime monitoring |
| /compositions | GET | Lists all available templates with their input schemas |
| /render | POST | Async render: validates props against the Zod schema and returns a jobId |
| /render-sync | POST | Synchronous render: blocks until the video is ready (for short pipelines) |
| /status/:jobId | GET | Job status and progress percentage for polling clients |
| /download/:jobId | GET | Streams the rendered MP4 file back to the caller |
| Step | Component | What it does |
|---|---|---|
| 1 | Zod schema validation | Rejects malformed payloads before burning render time |
| 2 | calculateMetadata() | Measures audio length and sets composition duration dynamically; scene durations distribute proportionally |
| 3 | Asset resolution | Downloads images, audio, and music from remote URLs or CDN volume |
| 4 | Remotion bundling | Compiles the React composition tree into a video-ready bundle |
| 5 | Frame-by-frame render | Headless Chrome renders every frame, passes to ffmpeg for h264+AAC encoding |
| 6 | Job store update | Marks the job complete and exposes the MP4 at /download/:jobId |
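Step 2 above is the interesting one: Remotion's `calculateMetadata()` lets the composition's `durationInFrames` be computed at render time. A minimal sketch of the proportional-distribution logic, assuming the audio length has already been measured (the real service could probe the file, e.g. with `getAudioDurationInSeconds` from `@remotion/media-utils`):

```typescript
// Sketch: derive the composition duration from the measured voiceover length
// and split it across scenes by relative weight. Function and parameter names
// are illustrative, not the service's actual code.
function distributeScenes(
  audioSeconds: number,
  sceneWeights: number[], // relative share of the video each scene occupies
  fps = 30,
): { durationInFrames: number; sceneFrames: number[] } {
  const durationInFrames = Math.round(audioSeconds * fps);
  const totalWeight = sceneWeights.reduce((a, b) => a + b, 0);
  const sceneFrames = sceneWeights.map((w) =>
    Math.round((w / totalWeight) * durationInFrames),
  );
  // Hand any rounding remainder to the last scene so frames sum exactly.
  const drift = durationInFrames - sceneFrames.reduce((a, b) => a + b, 0);
  sceneFrames[sceneFrames.length - 1] += drift;
  return { durationInFrames, sceneFrames };
}
```

Because the duration is derived from the audio, the same template works unchanged for every language version, however long its voiceover runs.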
To add a new template, I define its Zod schema in types/common.ts, register the composition in Root.tsx, and it becomes available via the API. No client code changes needed.
Videos must match voiceover audio length exactly. Different languages produce different audio durations for the same script.
TTS engines don't handle pauses naturally. Rapid-fire speech without breathing room sounds robotic.
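One common fix is to wrap the script in SSML and insert explicit `<break>` tags at sentence boundaries before sending it to Google Cloud TTS, which does support SSML breaks. A sketch of that idea; the 450 ms pause length and the function name are my assumptions, not values from the original pipeline:

```typescript
// Sketch: add SSML <break> tags between sentences so the TTS voice breathes.
// The pause length is an arbitrary illustrative choice.
function toSsmlWithBreaks(script: string, pauseMs = 450): string {
  const sentences = script
    .split(/(?<=[.!?])\s+/) // split after sentence-ending punctuation
    .map((s) => s.trim())
    .filter(Boolean);
  const body = sentences.join(`<break time="${pauseMs}ms"/> `);
  return `<speak>${body}</speak>`;
}
```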
Generating unique images per language triples cost. 5 images × 3 languages = 15 generations per topic.
Remotion needs headless Chrome + ffmpeg. Default images are x86 and huge. Oracle's free ARM VM demanded a lean ARM64 build.
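A lean ARM64 image can lean on Debian's native arm64 packages for Chromium and ffmpeg instead of downloaded x86 binaries. The sketch below is hypothetical: the base image, package names, and env variable are my assumptions, not the project's actual Dockerfile (Remotion's renderer accepts a `browserExecutable` option that the app would read this path into).

```dockerfile
# Hypothetical sketch -- base image and package names are assumptions,
# not the project's actual Dockerfile.
FROM node:20-bookworm-slim

# System Chromium + ffmpeg from Debian's arm64 repos, instead of the
# x86 binaries a default setup would download.
RUN apt-get update && apt-get install -y --no-install-recommends \
      chromium ffmpeg && \
    rm -rf /var/lib/apt/lists/*

# Assumed app-level variable: the server passes this path to Remotion's
# renderer as browserExecutable.
ENV BROWSER_EXECUTABLE=/usr/bin/chromium

WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .

EXPOSE 3000
CMD ["node", "server.js"]
```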
Every template accepts different props. A malformed payload mid-render wastes minutes and leaves broken MP4s in the job store.
Job queues usually need Redis or a worker framework. Overkill for a single-VM microservice handling dozens of jobs/day.
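At that scale, a `Map` behind the HTTP handlers does the whole job. A minimal sketch of such an in-memory store; class and field names are illustrative, not the service's actual code:

```typescript
// Minimal in-memory job store sketch: a Map instead of Redis, which is
// enough for a single-VM service handling dozens of jobs per day.
type Job = { id: string; state: "queued" | "rendering" | "done" | "error"; progress: number };

class JobStore {
  private jobs = new Map<string, Job>();
  private counter = 0;

  create(): Job {
    const job: Job = { id: `job-${++this.counter}`, state: "queued", progress: 0 };
    this.jobs.set(job.id, job);
    return job;
  }

  update(id: string, patch: Partial<Omit<Job, "id">>): void {
    const job = this.jobs.get(id);
    if (!job) throw new Error(`unknown job ${id}`);
    Object.assign(job, patch); // render loop reports progress here
  }

  get(id: string): Job | undefined {
    return this.jobs.get(id); // backs GET /status/:jobId
  }
}
```

The trade-off is explicit: jobs don't survive a process restart, which is acceptable when the orchestrator (n8n) can simply resubmit.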
I build production-grade content automation systems — from AI video pipelines to multilingual email marketing. End-to-end delivery, zero middleware.
💬 Hire Me on Contra