Gridware

Senior ML Infrastructure Engineer

Gridware · San Francisco, CA

Actively hiring · Posted 6 months ago

Responsibilities

  • Design, build, and maintain the infrastructure, tooling, and workflows that enable reliable, scalable deployment of ML models to production.
  • Develop monitoring and observability systems to track model performance, data drift, data quality, and overall system health.
  • Create and maintain end-to-end testing frameworks and simulation environments to validate models and pipelines prior to deployment.
  • Work closely with Data Engineering and Platform Engineering teams to ensure ML systems integrate cleanly with broader Gridware infrastructure and operational standards.
  • Improve CI/CD pipelines for ML workloads, ensuring reproducibility, safe rollout, and automated rollback strategies.

Qualifications

  • 5+ years of experience building production ML infrastructure
  • Strong software engineering skills and proficiency in Python
  • Experience with cloud platforms (AWS) and container orchestration (Kubernetes)
  • Familiarity with feature stores, model registries, or centralized metadata systems (e.g., MLflow)

Benefits

Health, dental, and vision (Gold and Platinum plans, with some providers' plans fully covered)

Paid parental leave

Alternating day off (every other Monday)

“Off the Grid,” a two-week paid break each year for all employees

Commuter allowance

Company-paid training

About the company

Gridware is a San Francisco-based technology company dedicated to protecting and enhancing the electrical grid. We pioneered a groundbreaking new class of grid management called active grid response (AGR), focused on monitoring the electrical, physical, and environmental aspects of the grid that affect reliability and safety. Gridware’s advanced Active Grid Response platform uses high-precision sensors to detect potential issues early, enabling proactive maintenance and fault mitigation. This comprehensive approach helps improve safety, reduce outages, and ensure the grid operates efficiently. The company is backed by climate-tech and Silicon Valley investors. For more information, please visit www.Gridware.io.

Role Description

As a Senior ML Infrastructure Engineer, you will work directly in the Automation org with the core ML, Ops, and Analytics teams to improve and build out the infrastructure around model deployment and monitoring. This role is essential to scaling the time savings Gridware delivers to customers.

Tags & focus areas

Full-time · Machine Learning · AI