Senior ML Infrastructure Engineer

Chemify Ltd · Glasgow, SCT, GB

Actively hiring · Posted 24 days ago

About Chemify

Chemify is revolutionising chemistry. We are creating a future where the synthesis of previously unimaginable molecules, drugs, and materials is instantly accessible. By combining AI, robotics, and the world's largest continually expanding database of chemical programs, we are accelerating chemical discovery to improve quality of life and extend the reach of humanity.

Job Description

We are hiring a Senior ML Infrastructure Engineer to build, enable, and operate the core platform that powers Chemify's machine learning and scientific AI computing workloads. This role sits at the intersection of distributed systems engineering, machine learning infrastructure, scientific computing, and platform engineering.

You will build and operate the operational backbone of the ML platform, ensuring that pipelines run reliably across Kubernetes clusters, on-premise GPU infrastructure, and serverless compute environments. The systems you build will support ML engineers and computational chemists running workloads from large-scale model training to molecular simulation.

If you enjoy building complex technical systems at the intersection of ML and scientific computing, and working on platform problems that combine distributed systems, cloud and on-premise GPU infrastructure, and real-world scientific workloads, you'll thrive here.

Key Responsibilities

  • ML Pipeline Orchestration: Implement routing logic that dispatches workloads to appropriate compute backends; maintain workflow reliability, including retries, dependency management, and failure recovery.
  • Linux Administration: Server administration and support including security and scaling.
  • Kubernetes Platform Operations: Operate clusters for ML training, inference, and batch workloads; maintain container build pipelines and GitOps deployment workflows; optimise cluster scheduling, autoscaling, and GPU utilisation.
  • HPC / GPU Compute Integration: Integrate orchestration systems with HPC job schedulers; maintain execution paths for workloads running on GPU clusters; ensure artifacts and results from HPC jobs are captured and versioned.
  • Model & Experiment Lifecycle: Operate model registry and experiment tracking platforms; ensure training runs are reproducible and linked to code and datasets; support promotion of models from staging to production.
  • Data Versioning & Pipeline Traceability: Implement dataset versioning and lineage tracking across ML pipelines; ensure predictions are traceable to model versions and datasets; maintain reproducible ML training pipelines.
  • Platform Tooling & Developer Experience: Develop platform CLI tools and pipeline templates; maintain base container images used for ML workloads; improve developer workflows for ML engineers and scientists.
  • Observability, Security & Governance: Implement monitoring, logging, and alerting across orchestration systems; maintain infrastructure as code for platform resources; ensure workloads are traceable to source code, container images, and execution environments.

What You'll Bring

  • Degree in Science, Engineering or related field (or equivalent practical experience).
  • Strong Python engineering skills.
  • Experience operating workflow orchestration platforms.
  • Strong Kubernetes platform experience.
  • Experience with containerisation and CI/CD pipelines.
  • Experience with cloud infrastructure such as AWS & GCP.
  • Experience operating distributed systems in production.
  • Strong Linux systems engineering skills.

Beneficial Skills

  • Argo Workflows or Kubernetes workflow engines.
  • SLURM or other HPC job schedulers.
  • ML experiment tracking tools such as Weights & Biases or MLflow.
  • Data versioning or lakehouse technologies such as LakeFS, Iceberg, or Delta Lake.
  • Scientific computing environments.
  • Internal developer platform or CLI tooling experience.
  • Experience in Cyber Security and operating in regulated environments.

Why Join Chemify?

Impact:

You will lead a team of Lab Coordinators and Production Technicians responsible for maintaining and improving the lab space at the first Chemifarm, laying the foundation for how future Chemifarms will operate.

Autonomy:

Reporting to the Inventory and Materials Manager, you will own the lab operations strategy, improve processes and workflows, and have a meaningful impact on how Chemify labs run and operate.

Ambition:

We are a Series B deep-tech company investing in world-class infrastructure and tackling problems at the frontier of AI, robotics, and chemistry. You will have the resources, the data, and the mandate to shape how laboratories at Chemify run and are maintained.
