Inference, elegantly engineered

Careers

Build the endpoint that production AI teams stop worrying about.

Direct Inference makes model selection, capability handling, spend guardrails, and provider churn disappear behind one durable surface. We are hiring people who like hard infrastructure, crisp product taste, and systems that earn trust by holding less.

Open roles

Work style: Remote / SF
Engagement: Full-time
Main focus: Forward deployed

Hiring for leverage

Small team, broad ownership, production systems, and visible customer impact.

Open positions

9 roles, all building the same promise.

Every role touches the zero-knowledge endpoint: product quality, developer trust, operational reliability, and the data loop that makes it better.

Priority role · Engineering · Full-time

Forward Deployed Engineer

Work directly with high-intent customers to get production AI workloads running on Direct Inference, then bring the sharp edges back into the product and serving engine.

Embed with customers during pilots, migrations, and launch moments so one endpoint replaces brittle model-selection code.
Build reference integrations, workflow-specific tooling, and small product fixes that turn deployment friction into reusable platform improvements.
Partner with engineering on request classification, observability, spend controls, and reliability when a customer workload exposes a new edge case.
Translate customer patterns into docs, examples, product requirements, and internal operating knowledge.

Location: Remote / San Francisco, CA
Experience: 3-6 years
Compensation: $140k-$210k

Engineering · Full-time

Senior Inference Engineer

Own and extend the serving engine: the quality, latency, health, and price signals that decide how every request is served.

Design and improve the decision systems that weigh capability, quality, latency, provider health, and cost for each request.
Build evaluation loops for code, reasoning, document, vision, long-context, and structured-output traffic.
Improve fallbacks and promotions so hard requests get stronger handling while simple traffic stays fast and cost-effective.
Instrument the serving path with the right internal signals while preserving the zero-knowledge product contract.

Location: Remote / San Francisco, CA
Experience: 5+ years
Compensation: $170k-$240k

Infrastructure · Full-time

Platform Reliability Engineer (SRE)

Keep one endpoint dependable across a churning set of upstream providers through failover, rate-limit absorption, and spend caps that fail closed.

Own production reliability for the endpoint, from deploy safety and health checks to incident response and post-incident hardening.
Build provider-health automation, saturation controls, and graceful degradation paths for high-volume customer traffic.
Strengthen observability around latency, errors, spend, and capacity without exposing private serving internals to customers.
Make billing and spend-limit enforcement boringly reliable, especially under abuse, provider incidents, and runaway workloads.

Location: Remote
Experience: 4+ years
Compensation: $160k-$230k

Engineering · Full-time

Full Stack Engineer

Build the product and platform surfaces that make Direct Inference feel like one dependable endpoint, from API workflows to dashboard tools used by production teams.

Location: Remote / San Francisco, CA
Experience: 3-5 years
Compensation: $120k-$180k

Engineering · Full-time

Frontend Engineer

Craft fast, polished interfaces for Direct Inference, including the playground, traces, billing controls, documentation chrome, and developer-facing product flows.

Location: Remote / San Francisco, CA
Experience: 2-4 years
Compensation: $100k-$150k

AI/ML · Full-time

Machine Learning Engineer

Advance the systems that classify requests, evaluate answer quality, and make sure code, reasoning, vision, document, and structured-output traffic is served by the right capability.

Location: Remote / San Francisco, CA
Experience: 3-6 years
Compensation: $150k-$220k

Data · Full-time

Data Scientist

Turn usage, quality, cost, and reliability signals into product decisions that improve Direct Inference without exposing private serving details.

Location: Remote / San Francisco, CA
Experience: 2-5 years
Compensation: $110k-$160k

Engineering · Full-time

Backend Engineer

Build robust backend systems for API ingress, auth, billing, observability, and the zero-knowledge endpoint that production teams rely on.

Location: Remote / San Francisco, CA
Experience: 3-5 years
Compensation: $120k-$170k

Infrastructure · Full-time

DevOps Engineer

Own the infrastructure behind a growing inference surface, from deploy safety and provider health to uptime, observability, and incident response.

Location: Remote / San Francisco, CA
Experience: 3-5 years
Compensation: $130k-$180k

How we work

High ownership, low ceremony, real production stakes.

Close to customers

You will see how teams use the endpoint, where it earns trust, and where the product needs to get sharper.

Systems over theater

We care about the path from request to answer: quality, cost, latency, privacy, and the operational details that keep it boring.

Tiny surface, deep work

The customer sees one endpoint. Behind it is a lot of careful engineering, product judgment, and data-informed iteration.

Open applications

Don't see the exact role? Tell us where you would raise the ceiling.

We are always interested in people who can make Direct Inference more reliable, easier to adopt, cheaper to operate, or clearer to trust.
