Distributed Systems Engineer
San Francisco, CA (Onsite)

About the Company

A fast-moving AI research group is building the core video data infrastructure used by leading AI labs and major tech companies. The team is small at around fifteen people, nearly all engineers, and recently pivoted to focus exclusively on high-quality video data at massive scale.

The shift has driven significant revenue growth, and they are now planning to expand the team steadily over the next few months.

The culture is straightforward: engineering-led, product-focused, low-ego, and built around people who enjoy ownership. They work in person five days a week in their San Francisco office, moving quickly, solving hard problems, and avoiding micromanagement.

The Role

This position focuses on designing and scaling distributed systems that support huge ML and ETL workloads across petabytes of video. You will own core infrastructure: compute scheduling, orchestration, throughput, reliability, cost efficiency, and the internal tooling that keeps the entire engineering group moving at pace.

The company is beginning to scale its infrastructure footprint aggressively, and this role will become central to that growth. It is a hands-on IC position suited to someone who has operated critical systems before and wants to shape the foundation of a rapidly expanding platform.

What You’ll Work On

• Architect and scale distributed systems for large-scale ML and ETL workloads
• Build compute orchestration and scheduling across thousands of GPUs
• Improve uptime, resilience, and execution speed of high-volume data pipelines
• Design pipelines capable of handling petabyte-level video datasets
• Lead the development of CI/CD and internal tooling for fast iteration
• Partner closely with research engineers delivering new video models and algorithms
• Operate in a high-trust environment with strong autonomy and clear ownership

Requirements

• 3+ years building foundational distributed systems or data infrastructure
• Experience running critical systems at significant scale
• Proficient across cloud architectures
• Strong coding experience with Go (preferred) and Python
• Background building or maintaining large-scale pipelines
• Experience with ML-focused CI/CD and automation
• Video domain experience is not required
• Operates as a strong IC who leads through action
• Fully onsite in San Francisco, Monday to Friday

Culture Fit

• Enjoys ambiguity, problem discovery, and self-direction
• Communicates clearly and concisely
• Shows strong intellectual curiosity
• Low ego, collaborative mindset
• Motivated by building core systems in a small, high-caliber team

Red flags include weak communication, low curiosity, and unclear motivation for the domain.

Interview Process

• Intro call focused on culture, curiosity, and communication
• Technical discussion on background and complexity of past work
• Problem-solving session with a research engineer
• Onsite research problem and collaboration exercise
