Own the stack that turns the world's documents into intelligence.
Reducto is the document ingestion layer for leading AI teams, turning complex documents into structured data that LLMs can reliably use. The company is hiring an ML Infrastructure Engineer to own the training and inference systems behind it.
The vast majority of enterprise data is trapped in documents.
Financial statements, medical records, insurance forms, SEC filings — the information that runs the economy lives in PDFs, spreadsheets, and scans that LLMs can't reliably read. Traditional OCR breaks on real-world complexity.
Reducto fixes the bottleneck. By combining layout-aware computer vision with vision-language models, Reducto reads documents the way a human would — capturing tables, figures, and meaning with near-perfect accuracy, then correcting its own mistakes in real time. Hundreds of companies, from fast AI startups to a Fortune 10 enterprise, now run their most critical document workflows on Reducto. The models powering all of it need infrastructure that never becomes the bottleneck. That's the job.
Small team. Real ownership. Production from day one.
You own it end to end
This is a high-impact IC role. You take the training and inference stack and make it yours: no layers, no hand-offs, and no treating infrastructure as someone else's problem.
You work beside the researchers
Sit shoulder-to-shoulder with the ML team. Your pipelines and tooling are what let them move from experiment to production fast — your work directly drives model accuracy, latency, and cost.
You ship fast, in person
Five days a week together in San Francisco, working in one of the hottest areas of AI infrastructure. Shipping fast is the culture, not the exception.
Built by engineers who've lived this problem.
- MIT — met co-founder Raunak there
- Previously at Google before founding Reducto
- Started Reducto in 2023 after seeing OCR fail on real enterprise documents firsthand
- Scaled the company to ~75 people and a Fortune-10 customer base in under three years
- MIT — deep technical roots in ML and systems
- Previously engineering at NVIDIA
- Architected Reducto's OCR + VLM document system from the ground up
- Leads a technical culture that prizes accuracy, speed, and real ownership
The investors building the AI era are behind Reducto.
A $75M Series B led by a16z brought total funding to $108M — the kind of conviction capital that lets a 75-person team hire the best and build for the long term.
Own the training & inference stack.
ML Infrastructure Engineer
A high-impact IC role for a strong generalist who understands how ML models actually work — from serving and monitoring to data pipelines — and can make inference faster, more reliable, and cheaper. You'll work directly with ML researchers to get models deployed quickly and reliably, and make sure infrastructure is never the bottleneck for the products customers like Harvey and Scale depend on.
What you'll do
- Build and maintain model serving infrastructure — improving inference speed, monitoring, and reliability
- Stand up and improve training infrastructure for models from 300M to 30B parameters, running on 1 to 3 nodes
- Develop observability, logging, and monitoring across the ML stack
- Build internal data pipelines and tooling so researchers move faster from experiment to production
- Architect inference arbitration across multiple cloud providers, optimizing for accuracy, latency, and cost
What we're looking for
- 3+ years in ML infrastructure, focused on model serving and training infrastructure
- Built and run ML serving/inference infra at a company with a high talent bar
- Strong Python and systems engineering skills, including Kubernetes; expertise in diagnosing GPU bottlenecks and optimizing serving
- AI-native from the start — ideally from a startup that trains its own models (ElevenLabs, Cursor, Cognition, Cartesia, Fireworks, and the like)
- Bonus: top technical program (MIT / Stanford / Caltech), ML-systems research pedigree (e.g. Hazy / Christopher Ré, Tim Kraska), publications at NeurIPS/ICML/ICLR/CVPR, strong open-source work, or math/CS competition pedigree (IOI / IMO / Putnam)
The interview process is built to find exactly that: a recruiter intro, a technical phone screen, a hands-on ML debugging & coding session, a conversation with the team, and a final on-site.
The timing, the team, and the surface area.
Right moment, hottest space
AI document infrastructure is exploding, and Reducto is the category leader. Joining at 75 people post-Series B is the rare on-ramp before scale.
Total ownership
You own the stack, not a ticket queue. Your work moves accuracy, latency, and cost for real customers immediately.
Proof, not hype
8x revenue growth last year, hundreds of customers from AI-native startups to Fortune 10 enterprises, and a $75M Series B led by a16z. The traction is real.
A bar worth clearing
MIT-founded, ex-Google/NVIDIA, backed by a16z and Benchmark. You'll work with people who push you.
Comp that respects talent
$200K–$300K base for ML Infra, and Reducto will exceed it for the right person, plus competitive equity in a fast-rising company.
Build, don't maintain
Greenfield infrastructure powering production document workflows at scale. You're laying track, not patching legacy.
Let's have a 15-minute conversation.
No pressure and no resume needed to start — just a quick call to talk through the role, the team, and whether the timing is right for you. I'm managing this search directly.