
AI Engineer - Reinforcement Learning

Blue Yonder · Paris, A8, FR

Actively hiring · Posted 4 months ago

About the AI Studio

The AI Studio's mission is to find the fastest possible path to an autonomous supply chain. We're developing AI agents, learning systems, training models, and more to overcome the biggest challenges remaining in the global supply chain.

In short, we are having a lot of fun.

Your mission in this role

We're looking for an ambitious AI Engineer specialising in Reinforcement Learning to work on environments, evaluations, data pipelines, and tooling for robust training systems.

You'll help shape how we approach reward modeling, environment design, and agent training. If you're energised by pushing the boundaries of what’s possible, this is your chance.

Responsibilities:

  • Design and implement RL environments for supply chain decision-making

  • Develop reward functions that capture what "good" looks like for our agents

  • Create evaluation frameworks to measure agent performance and catch failure modes

  • Build data pipelines for training and human feedback collection

  • Document what works (and what doesn't) so we can compound our learnings

  • Stay on top of industry trends and cutting-edge use cases

We want to talk if you:

  • Have trained or fine-tuned LLMs

  • Are excited about AI-assisted tools and getting the most out of them

  • Build and customize your own AI workflows

  • Have experience working with AI agents and RL environments in production

  • Are proficient in Python and PyTorch

  • Can balance research exploration with shipping working code

  • Have hands-on experience with RL techniques (reward shaping, policy optimization, RLHF)

  • Thrive in fast-moving environments where priorities shift

  • Care about craft in your work

  • Are curious about why things work, not just that they work

Bonus points if:

  • You have experience with human-in-the-loop ML systems

  • You've built evaluation frameworks for open-ended tasks

  • You're familiar with supply chain, logistics, or operations domains

  • You have a side project that shows you can't stop tinkering

Our Values

If you want to know the heart of a company, take a look at their values. Ours unite us. They are what drive our success – and the success of our customers. Does your heart beat like ours? Find out here:

Core Values

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.
