Video is 80% of the internet.
Someone has to make sense of it.

Sieve is the only AI research lab exclusively focused on video data. They combine exabyte-scale infrastructure, novel video understanding techniques, and dozens of data sources to build training datasets that push the frontier of AI, and they brought in $XXM in revenue last quarter with a team of just 12.

80%

of internet traffic is video

$XXM

Revenue last quarter
12-person team

Series A

Matrix, YC, Swift, AI Grant

2022

Founded. Fast-moving from day one.

The Problem

Frontier AI is bottlenecked by video training data.

Video generation, multimodal models, robotics, AR/VR — all require high-quality, precisely curated video at massive scale. Today, AI labs spend months manually cobbling together pipelines to collect and filter it. The infrastructure doesn't exist yet. Sieve is building it.

The Solution

📡

Exabyte-Scale Infrastructure

Pipelines to acquire, process, and index video at internet scale — built from first principles.

🎯

Precision Video Understanding

Novel CV, audio, and text techniques to filter for quality with high precision at scale.

🔗

Multi-Source Data Pipelines

Dozens of sources combined into frontier-ready training sets, delivered to top AI labs.

Why Now · Why Here

The right problem at the right moment.

You're not optimizing someone else's legacy system. You're building the data infrastructure that doesn't exist yet — for the AI applications that will define the next decade.

Real revenue, tiny team

$XXM last quarter with 12 people. This is product-market fit — not a science project.

Research autonomy

Ambiguous problems, clever solutions. You won't be handed tickets — you'll define the approach.

Direct impact on frontier models

Your pipelines feed the models that matter. Short feedback loops, top AI labs as customers.

Visa sponsorship available

H-1B, O-1, and OPT supported. Competitive comp + equity: $150K–$350K range.

The Team

Mokshith Voodarla

Co-Founder & CEO

Formerly at Scale AI, where he was embedded in the data pipelines powering some of the world's most capable models. Brings rare fluency in both data infrastructure and the needs of frontier AI labs, and knows what they need before they ask for it.

Scale AI · Data Infrastructure · UC Berkeley

Abhinav Ayalur

Co-Founder & CTO

Spent his career where computer vision meets the real world — at Niantic, building AR systems at scale for millions of users, and at Second Spectrum, turning broadcast footage into analytical intelligence for the NBA and Premier League. Now applying that expertise at exabyte scale.

Niantic · Second Spectrum · Computer Vision

Open Role

Engineering

Applied Research Engineer

Build high-performance building blocks and large-scale pipelines to understand video at internet scale. You'll work across computer vision, audio, and text processing — squeezing every drop of performance through clever pre/post-processing, parallelism, pipelining, and inference optimization.

🏢 In-Person · SF 💰 $150K–$350K + Equity 🛂 Visa Sponsored ⚡ Full-Time

You're a strong fit if you...

Have 2+ years in computer vision or audio processing
Are a strong Python developer with hands-on PyTorch experience
Communicate well with customers and external teams
Write clean, maintainable code — active GitHub a plus
Are motivated by end-to-end product ownership, not just model training
Can break a customer problem down into the right technical building blocks

Schedule a quick conversation →
