
Behind the scenes: how our multi-shot engine works

A peek at the rendering pipeline, why we chose browser automation for now, and what is coming next.

Mohamed F.
Engineering

People keep asking how our multi-shot engine actually renders videos. Short answer: duct tape, Puppeteer, and a lot of careful engineering. Long answer below.

The problem

When we started, no video-generation API offered multi-shot in a single render. Runway had it in their web UI, but not in their API. Every other provider forced us to render shots individually and stitch them — which broke visual consistency and doubled our costs.

The pragmatic choice: browser automation

Instead of waiting for APIs to catch up, we built a browser automation layer that drives Runway’s web interface directly. Each job spins up a headless Chrome instance, logs into a pre-authorized account, fills the storyboard UI, clicks render, and waits.

Client → POST /api/jobs (SQLite record)
       → startJobWorker(jobId)
           → execFile("node lib/runway-browser.mjs")
               → Puppeteer + stealth plugin
                   → fills scenes in Runway Multishot UI
                   → triggers Generate
                   → polls DOM for <video> element
                   → downloads MP4
           → uploads to Cloudflare R2
           → updates job status = completed
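The control flow above can be sketched as a single async function. This is an illustrative outline, not our actual code: the real worker shells out to `lib/runway-browser.mjs`, while here the render and upload steps are injected as plain functions so the happy path and failure path are visible on their own. All names (`runJob`, `render`, `upload`, `setStatus`) are hypothetical.

```javascript
// Per-job worker sketch: mark processing, render via browser automation,
// upload the MP4, then record the terminal status. Dependencies are injected
// so the flow can be read (and tested) without Puppeteer or R2.
async function runJob(jobId, { render, upload, setStatus }) {
  await setStatus(jobId, "processing");
  try {
    const localPath = await render(jobId); // Puppeteer drives the Runway UI
    const url = await upload(localPath);   // push the MP4 to Cloudflare R2
    await setStatus(jobId, "completed", url);
    return url;
  } catch (err) {
    // Any failure along the way flips the job to "failed" before rethrowing.
    await setStatus(jobId, "failed");
    throw err;
  }
}
```

Keeping the browser step behind a function boundary is also what makes the planned model migration invisible later: swap `render` out, and nothing else changes.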

The queue

Jobs are stored in a SQLite file with status fields: pending, processing, completed, failed. A background worker polls for pending jobs and runs them through the browser automation. Because SQLite is just a file and the worker is a long-running Node process, we survive restarts — on boot, we requeue anything left in-flight.
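The requeue-on-boot step is the only interesting part of the queue. Here is a minimal sketch of it over a plain array standing in for the SQLite table; the function name and job shape are illustrative, not our schema.

```javascript
// On startup, anything left "processing" from before the restart is pushed
// back to "pending" so the worker picks it up again. Returns the work list.
function requeueInFlight(jobs) {
  for (const job of jobs) {
    if (job.status === "processing") job.status = "pending";
  }
  return jobs.filter((j) => j.status === "pending");
}
```

In SQLite this is one `UPDATE ... WHERE status = 'processing'` at boot, which is exactly why a crash mid-render costs us a retry, not a lost job.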

This is the reason you can close your browser tab during a render. The job is server-side state; your tab is just a viewer polling a public endpoint for progress.

What we learned the hard way

  • Headless Chrome is memory hungry. 1GB per instance is normal. Hetzner VPS works; Vercel serverless does not.
  • Google OAuth hates automation. Stealth plugin + persistent profile kept our session alive. Anti-bot was a year-long arms race.
  • File cleanup matters. After upload to R2, we delete the local file. Without this, 35GB disk fills in a week.
  • Status polling needs backoff. Our first iteration polled every 1s and we effectively DDoS'd ourselves. 5s feels right.
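The last lesson can be sketched as a small polling helper. This is a hedged illustration, not our production code: `pollStatus` and its parameters are made-up names, and `sleep` is injected so the schedule is visible. It starts fast and backs off toward the ~5s cadence mentioned above.

```javascript
// Poll a status check with exponential backoff, capped at maxMs.
// `check` resolves to a job status; `sleep` is injected for testability.
async function pollStatus(check, { sleep, baseMs = 1000, maxMs = 5000 } = {}) {
  let delay = baseMs;
  for (;;) {
    const status = await check();
    if (status === "completed" || status === "failed") return status;
    await sleep(delay);
    delay = Math.min(delay * 2, maxMs); // 1s -> 2s -> 4s -> 5s cap
  }
}
```

The cap matters more than the curve: renders take minutes, so there is no reason to ever poll faster than every few seconds once the job is clearly in flight.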

What is next

We are migrating to a proper message queue (BullMQ + Redis) to handle concurrency across multiple worker pools. That opens the door to real priority tiers — not just "faster queue" but "dedicated hardware" for Studio customers.

Longer-term, we are training our own multi-shot model. Browser automation was the right choice to launch in 3 weeks. It is not the right choice to scale to a million videos per month. When we flip the switch, the change will be invisible to users — the API stays the same, just faster and cheaper.

Build the thing that works in days. Then replace the hacks while users use the product.
