Production-grade · v1

Video Editing Agent

A production system that automates complex video editing workflows, built on a single hard rule: separate thinking from doing. The model plans, a deterministic worker executes, and every decision is written down.

Deterministic execution · LLM planning · Trace-first

Section 01

Summary & core philosophy

Many agentic systems lean on fuzzy state management. This one does the opposite: the orchestrator does all the thinking, the worker does all the doing, and the boundary between them is a structured plan rather than a guess.

LLM planning

A high-level orchestrator (GPT-4o or similar) reads the raw footage and the user's brief, then writes a structured EditPlan: a sequence of discrete, deterministic operations.
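
To make the idea of a structured plan concrete, here is a minimal sketch of what an EditPlan could look like in TypeScript. The operation kinds (cut, transition, overlay) are taken from the schema comment on plan_data; every field name, the `style` values, and the `validatePlan` helper are assumptions for illustration, not the system's actual format.

```typescript
// Hypothetical EditPlan shape; field names are illustrative assumptions.
type CutOp = { kind: 'cut'; assetId: string; startSec: number; endSec: number };
type TransitionOp = { kind: 'transition'; style: 'crossfade' | 'hard_cut'; durationSec: number };
type OverlayOp = { kind: 'overlay'; assetId: string; atSec: number; durationSec: number };
type EditOp = CutOp | TransitionOp | OverlayOp;

interface EditPlan {
  jobId: string;
  version: number;
  ops: EditOp[];
}

// Returns a list of human-readable problems. An empty list means the plan
// passed these basic checks and can be handed to the worker.
function validatePlan(plan: EditPlan): string[] {
  const problems: string[] = [];
  plan.ops.forEach((op, i) => {
    switch (op.kind) {
      case 'cut':
        if (op.startSec < 0 || op.endSec <= op.startSec) {
          problems.push(`op ${i}: cut needs 0 <= startSec < endSec`);
        }
        break;
      case 'transition':
        if (op.durationSec <= 0) {
          problems.push(`op ${i}: transition duration must be positive`);
        }
        break;
      case 'overlay':
        if (op.atSec < 0 || op.durationSec <= 0) {
          problems.push(`op ${i}: overlay needs atSec >= 0 and a positive duration`);
        }
        break;
    }
  });
  return problems;
}
```

Validating the plan before it reaches the worker keeps malformed model output on the planning side of the boundary, where it can be regenerated cheaply.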

Deterministic worker

A Python worker consumes the EditPlan and executes it with FFmpeg and MoviePy. When a command fails, it reports the exact error straight back to the orchestrator.
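
The worker contract can be sketched as follows. The real worker is written in Python; this TypeScript version only illustrates the shape of the idea, and the `CutStep`, `StepResult`, and `Runner` names are hypothetical. The FFmpeg options used (`-y`, `-ss`, `-i`, `-t`, `-c:v`, `-c:a`) are standard, but the specific encoder choices are assumptions.

```typescript
// Hypothetical types; they illustrate the contract, not the actual worker.
interface CutStep { input: string; output: string; startSec: number; endSec: number }

interface StepResult {
  ok: boolean;
  ffmpegArgs: string[];
  stderr: string; // exact FFmpeg error text, empty on success
}

// The same step always yields the same argument list: no model call, no randomness.
function buildCutArgs(step: CutStep): string[] {
  const duration = step.endSec - step.startSec;
  return [
    '-y',                              // overwrite the output file if it exists
    '-ss', step.startSec.toFixed(3),   // seek to the start of the cut
    '-i', step.input,
    '-t', duration.toFixed(3),         // keep this many seconds
    '-c:v', 'libx264', '-c:a', 'aac',  // re-encode rather than stream-copy
    step.output,
  ];
}

// `run` is injected so the sketch stays free of process-spawning details.
type Runner = (args: string[]) => { exitCode: number; stderr: string };

function executeCut(step: CutStep, run: Runner): StepResult {
  const ffmpegArgs = buildCutArgs(step);
  const { exitCode, stderr } = run(ffmpegArgs);
  // On failure, stderr is passed through verbatim for the orchestrator to read.
  return { ok: exitCode === 0, ffmpegArgs, stderr: exitCode === 0 ? '' : stderr };
}
```

Because the arguments are a pure function of the step, a failed command can be replayed exactly from its logged inputs.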

No VectorDB for state

State lives in PostgreSQL with JSONB, not a vector store. A project's whole state is a serialized tree of decisions: reproducible, debuggable, and auditable.

Section 02

Database schema

Drizzle ORM manages the PostgreSQL schema, which is designed for high observability and strict data integrity. Four tables hold the entire lifecycle of a job.

Table	Role	Key columns
`jobs`	Top-level edit request and its lifecycle status.	status, edit_plan_id
`assets`	Source media: raw footage, b-roll, and audio.	asset_type, metadata
`edit_plans`	The structured LLM output, versioned per job.	plan_data, version
`traces`	Per-step observability for every FFmpeg call.	trace_id, ffmpeg_args

schema.ts

import {
  pgTable, uuid, varchar, text, timestamp, jsonb, integer, boolean, pgEnum,
  type AnyPgColumn,
} from 'drizzle-orm/pg-core';

export const jobStatusEnum = pgEnum('job_status', ['pending', 'processing', 'completed', 'failed', 'review_required']);

// Core Jobs Table
export const jobs = pgTable('jobs', {
  id: uuid('id').primaryKey().defaultRandom(),
  userId: uuid('user_id').notNull(),
  title: varchar('title', { length: 255 }).notNull(),
  status: jobStatusEnum('status').default('pending').notNull(),
  // jobs and edit_plans reference each other; the explicit return type
  // keeps TypeScript from failing on the circular inference.
  editPlanId: uuid('edit_plan_id').references((): AnyPgColumn => editPlans.id),
  createdAt: timestamp('created_at').defaultNow().notNull(),
  updatedAt: timestamp('updated_at').defaultNow().notNull(),
});

// Assets Table
export const assets = pgTable('assets', {
  id: uuid('id').primaryKey().defaultRandom(),
  jobId: uuid('job_id').references(() => jobs.id).notNull(),
  storageUrl: text('storage_url').notNull(),
  assetType: varchar('asset_type', { length: 50 }).notNull(), // 'raw_footage', 'b-roll', 'audio'
  metadata: jsonb('metadata').notNull(), // duration, resolution, fps, bitrate
  createdAt: timestamp('created_at').defaultNow().notNull(),
});

// Edit Plans Table (The LLM Output)
export const editPlans = pgTable('edit_plans', {
  id: uuid('id').primaryKey().defaultRandom(),
  jobId: uuid('job_id').references(() => jobs.id).notNull(),
  planData: jsonb('plan_data').notNull(), // Array of instructions: cut, transition, overlay
  version: integer('version').default(1).notNull(),
  llmModel: varchar('llm_model', { length: 50 }),
  promptTokens: integer('prompt_tokens'),
  completionTokens: integer('completion_tokens'),
  createdAt: timestamp('created_at').defaultNow().notNull(),
});

// Observability & Trace Logs
export const traces = pgTable('traces', {
  id: uuid('id').primaryKey().defaultRandom(),
  jobId: uuid('job_id').references(() => jobs.id).notNull(),
  traceId: varchar('trace_id', { length: 100 }).notNull(),
  stepName: varchar('step_name', { length: 255 }).notNull(),
  ffmpegArgs: text('ffmpeg_args'),
  renderTimeMs: integer('render_time_ms'),
  logs: text('logs'),
  payload: jsonb('payload'), // Input/Output snapshots
  // A real boolean instead of a 'true'/'false' string, so the database enforces the type.
  isFailure: boolean('is_failure').default(false),
  createdAt: timestamp('created_at').defaultNow().notNull(),
});
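
The job_status enum implies a lifecycle, which a small guard can make explicit. The five states come from the schema; the transition table below is an assumption, since the source defines the states but not which moves between them are legal.

```typescript
type JobStatus = 'pending' | 'processing' | 'completed' | 'failed' | 'review_required';

// Assumed edges between states; adjust to the real lifecycle.
const allowedTransitions: Record<JobStatus, JobStatus[]> = {
  pending: ['processing'],
  processing: ['completed', 'failed', 'review_required'],
  review_required: ['processing', 'completed', 'failed'],
  failed: ['pending'], // a retry re-queues the job
  completed: [],       // terminal state
};

// True when a job may move from `from` to `to` under the assumed table.
function canTransition(from: JobStatus, to: JobStatus): boolean {
  return allowedTransitions[from].indexOf(to) !== -1;
}
```

Checking a guard like this before each status update would keep a job from, for example, jumping from completed back to processing.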

Section 03

Video evaluation pipeline

An automated scoring engine guards quality before a render is finalized. Six metrics run on every candidate edit.

01
Clarity & resolution
Verifies export settings against the source assets.
02
Punchiness
Reads cut density and audio-visual sync, like cuts landing on the beat.
03
Retention scoring
LLM analysis of the first 3 seconds (the hook) and overall narrative pacing.
04
Caption readability
OCR checks and contrast-ratio validation for burned-in captions.
05
Brand fit
Color-histogram analysis and watermark / logo placement checks.
06
Watchability
Audio normalization (LUFS) and flicker detection.
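
As one illustration of how a metric like caption readability can be made deterministic, the sketch below computes a contrast ratio using the WCAG 2.x relative-luminance definition. Whether the pipeline uses this exact definition is an assumption; the function names and the 4.5 default (the WCAG AA minimum for normal-size text) are likewise illustrative.

```typescript
type Rgb = [number, number, number]; // 8-bit sRGB channels, 0-255

// Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition).
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

function relativeLuminance([r, g, b]: Rgb): number {
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// The ratio ranges from 1 (identical colors) to 21 (black against white).
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}

// Compares caption text color against a sampled background color.
function captionIsReadable(text: Rgb, background: Rgb, minRatio = 4.5): boolean {
  return contrastRatio(text, background) >= minRatio;
}
```

In practice the background color would be sampled from the frame region behind the caption, which this sketch leaves out.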

Golden renders · `score > 0.85`

A Golden Render is an edit that passed every evaluation gate. These are tagged in storage and used as ground truth for future style training, or shipped as the final deliverable.
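
A gate check along these lines could decide whether an edit qualifies. The six metric names mirror the evaluation list; the 0 to 1 scale and the rule that every individual metric must exceed 0.85 (rather than an average) are assumptions, as the source does not say how the threshold is applied.

```typescript
const METRICS = [
  'clarity', 'punchiness', 'retention',
  'captionReadability', 'brandFit', 'watchability',
] as const;
type Metric = typeof METRICS[number];
type Scores = Record<Metric, number>;

const GOLDEN_THRESHOLD = 0.85;

// Lists the metrics that did not clear the threshold (strictly greater than).
function failingGates(scores: Scores, threshold = GOLDEN_THRESHOLD): Metric[] {
  return METRICS.filter((m) => !(scores[m] > threshold));
}

// An edit is golden only when no gate fails.
function isGoldenRender(scores: Scores): boolean {
  return failingGates(scores).length === 0;
}
```

Returning the list of failing gates, not just a boolean, gives the orchestrator something specific to act on when it revises the plan.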

Section 04

Observability & tracing

A trace-first architecture makes complex FFmpeg pipelines debuggable. Every job carries a root trace_id; every sub-task inherits it and adds its own span.

Trace ID lifecycle

1
Job created
Every job is assigned a root trace_id at creation.
2
Sub-tasks inherit
Whisper transcription, FFmpeg cuts, and color grades inherit that trace_id and add a span_id.
3
Real-time shipping
Logs stream to the traces table as each step runs.
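
The inheritance described in the steps above can be sketched as a small context object passed from the job down to each sub-task. The `TraceContext` shape, the `parentSpanId` field, and the counter-based id source are assumptions for illustration; a real system would more likely use UUIDs or OpenTelemetry-style random ids.

```typescript
interface TraceContext {
  traceId: string;              // shared by every step of one job
  spanId: string;               // unique per step
  parentSpanId: string | null;  // null for the root span
  stepName: string;
}

// A counter-based id source keeps the sketch deterministic.
function makeIdSource(prefix: string): () => string {
  let n = 0;
  return () => `${prefix}-${++n}`;
}

// Called once when the job is created.
function startJobTrace(jobId: string, nextId: () => string): TraceContext {
  return { traceId: `trace-${jobId}`, spanId: nextId(), parentSpanId: null, stepName: 'job' };
}

// Each sub-task keeps the parent's traceId and gets its own spanId.
function childSpan(parent: TraceContext, stepName: string, nextId: () => string): TraceContext {
  return { traceId: parent.traceId, spanId: nextId(), parentSpanId: parent.spanId, stepName };
}
```

With this shape, filtering the traces table by a single trace_id returns every step of a job, and the span ids reconstruct the order and nesting.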

Logged properties

ffmpeg_args: The exact command-line string the worker passes to FFmpeg.
render_time: Per-step timing for each filtergraph execution, stored in milliseconds as render_time_ms.
cpu_usage / gpu_usage: Resource monitoring captured per task.
failures: Full stack traces and FFmpeg stderr, stored in the logs column.
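
A sketch of how one step might be timed and turned into a log record carrying the properties listed above. The `TraceRecord` shape and the `recordStep` helper are hypothetical and do not claim to match the traces table column for column; the clock and the command runner are injected so the logic can be exercised without FFmpeg.

```typescript
interface StepOutcome { exitCode: number; stderr: string }

interface TraceRecord {
  stepName: string;
  ffmpegArgs: string;   // full command line as one string (display only, not shell-escaped)
  renderTimeMs: number;
  logs: string;         // FFmpeg stderr when the step failed, otherwise empty
  failed: boolean;
}

// `now` is injectable so the timing can be tested without a real clock.
function recordStep(
  stepName: string,
  args: string[],
  run: (args: string[]) => StepOutcome,
  now: () => number = Date.now,
): TraceRecord {
  const startedAt = now();
  const outcome = run(args);
  const renderTimeMs = now() - startedAt;
  const failed = outcome.exitCode !== 0;
  return {
    stepName,
    ffmpegArgs: ['ffmpeg', ...args].join(' '),
    renderTimeMs,
    logs: failed ? outcome.stderr : '',
    failed,
  };
}
```

Resource usage (CPU and GPU) is omitted here because sampling it depends on the host platform.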

Section 05

Local-friendly stack

The whole system runs efficiently on high-end local hardware or private cloud instances. PostgreSQL doubles as the persistent queue and state store, so a worker restart never loses work.

TS orchestrator

Node.js · TypeScript

API, planning, and state management on Drizzle + PostgreSQL.

Python / FFmpeg worker

subprocess · moviepy

FFmpeg via subprocess; MoviePy for complex timeline composition.

Whisper local

faster-whisper

Local NVIDIA GPUs, or CPU inference through CTranslate2 (the engine behind faster-whisper), for high-speed transcription.

Redis queue

BullMQ · RQ

Reliable task distribution across TypeScript and Python workers.

JSONB state

PostgreSQL

Persistent queue and state store, so no work is lost on worker restart.
