Software Engineer, Agent Dev Velocity

Notion · San Francisco, California, United States · $214k - $300k

Actively hiring · Posted about 1 month ago

About Us:

Notion helps you build beautiful tools for your life’s work. In today's world of endless apps and tabs, Notion provides one place for teams to get everything done, seamlessly connecting docs, notes, projects, calendar, and email—with AI built in to find answers and automate work. Millions of users, from individuals to large organizations like Toyota, Figma, and OpenAI, love Notion for its flexibility and choose it because it helps them save time and money.

In-person collaboration is essential to Notion's culture. We require all team members to work from our offices on Mondays, Tuesdays, and Thursdays, our designated Anchor Days. Certain teams or positions may require additional in-office workdays.

About the Role:

Agent Dev Velocity builds the tooling and evaluation backbone that helps Notion ship high-quality AI faster and more safely. We build the infrastructure that makes AI evaluations easy to create, cheap to run, and hard to ignore, so engineers across the AI org can iterate with confidence.

In this role, you will work at the intersection of developer tooling, distributed systems, and measurement. You will build systems for running and maintaining evals at scale, and you will help create durable benchmarks and datasets that keep us honest about quality over time.

You will help evolve evals into a system by enabling reusable eval workspaces and data-driven workflows that surface issues through data mining and continuous measurement.

What You'll Achieve:

  • Build and improve scalable eval runners and harnesses that work locally, in CI, and on scheduled runs.

  • Make it easy for engineers to add high-signal evals: better templates, fixtures, debugging tools, and clear workflows.

  • Build and maintain benchmark and dataset tooling (curation pipelines, versioning, artifact management, and regression tracking).

  • Improve reliability and observability for eval execution (retries, idempotency, cost and latency visibility, and failure triage).

  • Partner closely with AI product, AI platform, and infrastructure teams to integrate evals into day-to-day shipping workflows.

Skills You'll Need to Bring:

  • Strong software engineering fundamentals and experience shipping production systems.

  • Proficiency with TypeScript/Node and/or Python.

  • Experience building reliable systems in distributed environments (queues, retries, idempotency, and backfills).

  • Comfort working with data pipelines (batch processing, data quality, versioning, and reproducibility).

  • Practical experience designing measurement or evaluation systems (LLM eval experience is a plus, but strong testing and benchmarking instincts also apply).

  • You don’t need to be an AI expert, but you’re curious and willing to adopt AI tools to work smarter and deliver better results.

Nice to Haves:

  • Experience building developer tooling (CLI tools, CI integrations, or internal platforms).

  • Familiarity with LLM evaluation techniques (rubrics, human review loops, dataset curation, and regression detection).

  • Experience collaborating across teams to roll out new workflows and drive adoption.

We hire talented and passionate people from a variety of backgrounds because we want our global employee base to represent the wide diversity of our customers. If you’re excited about a role but your past experience doesn’t align perfectly with every bullet point listed in the job description, we still encourage you to apply. If you’re a builder at heart, share our company values, and are enthusiastic about making software toolmaking ubiquitous, we want to hear from you.

Notion is proud to be an equal opportunity employer. We do not discriminate in hiring or any employment decision based on race, color, religion, national origin, age, sex (including pregnancy, childbirth, or related medical conditions), marital status, ancestry, physical or mental disability, genetic information, veteran status, gender identity or expression, sexual orientation, or any other applicable legally protected characteristic. Notion considers qualified applicants with criminal histories, consistent with applicable federal, state, and local law. Notion is also committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, please let your recruiter know.

Notion is committed to providing highly competitive cash compensation, equity, and benefits. The compensation offered for this role will be based on multiple factors such as location, the role’s scope and complexity, and the candidate’s experience and expertise, and may vary from the range provided below. For roles based in San Francisco or New York City, the estimated base salary range for this role is $214,000 - $300,000 per year.

By clicking “Submit Application”, I understand and agree that Notion and its affiliates and subsidiaries will collect and process my information in accordance with Notion’s Global Recruiting Privacy Policy and NYLL 144.

